From: Benjamin Mako Hill Date: Thu, 8 Aug 2013 10:42:38 +0000 (-0400) Subject: committed version from plane (should be good to go) X-Git-Url: https://projects.mako.cc/source/state_of_wikimedia_research_2013/commitdiff_plain/ba6cb65ce89c04f5222e74a82c93e4f030dd16d3 committed version from plane (should be good to go) ---
diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..7cb23ab --- /dev/null +++ b/.gitignore @@ -0,0 +1,13 @@ +auto/* +vc +20130809-wikimania_research.aux +20130809-wikimania_research.log +20130809-wikimania_research.nav +20130809-wikimania_research.out +20130809-wikimania_research.pdf +20130809-wikimania_research.pdfpc +20130809-wikimania_research.snm +20130809-wikimania_research.toc +notes.config +figures/.Rhistory +papers/*
diff --git a/20130809-wikimania_research.tex b/20130809-wikimania_research.tex index 3dddda9..58bb079 100644 --- a/20130809-wikimania_research.tex +++ b/20130809-wikimania_research.tex @@ -55,6 +55,8 @@ } } +% create an empty quotetxt so we can reuse it +\newcommand{\quotetxt}{} % add function to stop numbering appendix slides \newcommand{\backupbegin}{ @@ -163,7 +165,12 @@ \let\olditemize\itemize \renewcommand\itemize{\olditemize\itemsep-1pt} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\section{Introduction} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + %% SLIDE: Title Slide +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{frame}[plain] \begin{tikzpicture} @@ -193,8 +200,530 @@ \input{vc} \tikz[overlay,shift=(current page.south west)]{\node [xshift=5.6em,yshift=0.5em]{\colorbox{makopurple1}{\color{white} \tt \smaller \smaller \smaller revision:\ \VCRevision\ (\VCDateTEX)}};} + \note{I've been doing this for many years. I started in 2008 and + skipped one year, I think. + + This began as an excuse for me to make sure I was up to date on + Wikimedia Research.} + +\end{frame} + +%% SLIDE: Anecdote from Wikimania 2008 +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\renewcommand{\quotetxt}{``This talk will try to [provide] a quick + tour – a literature review in the scholarly parlance – of the last + year's academic landscape around Wikimedia and its projects geared + at non-academic editors and readers. It will try to categorize, + distill, and describe, from a birds eye view, the academic landscape + as it is shaping up around + our project.''\\ + \hfill – \e{From my Wikimania 2008 Submission}}
+\begin{frame} + + {\smaller \quotetxt} + + \pause + \includegraphics[width=\textwidth]{figures/google_scholar_result.png} + + \pause + \tikz{\draw (current page.center) [xshift=-2.1cm, yshift=0.9cm, color=red] + ellipse (1.5cm and 0.5cm);} +
+ \note<1>{Back at Wikimania 2008, I set out to run a session at + Wikimania that would provide a comprehensive literature review of + articles about Wikipedia published in the last year. + + \begin{quote} + \quotetxt + \end{quote} + + Then, about two weeks before Wikimania, I did the Google Scholar search + so I could build the literature review.} +
+ \note<2->{I tried to import the whole list into Zotero and managed + to get banned for abusing Google Scholar because they thought + that no human being could realistically consume the amount of + material published on Wikipedia that year. + + So anyway, I had a 45-minute talk, so it worked out to 3.45 seconds + per paper... + + And believe it or not, this year is even bigger.
+ + And my talk is even shorter.} + +\end{frame} + +%% SLIDE: Citations Per Year +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\begin{frame} + + \includegraphics[width=\textwidth]{figures/citations_by_year.pdf} + + \centering + + {\smaller \emph{Number of citations per year with the term + “wikipedia” in the title.\\ + (Source: Google Scholar results. Accessed: 2013-08-06)}} +
+ \note{Academics have written \e{a lot} of papers about + Wikipedia. There are more than 500 papers published about + Wikipedia each year and, although we may have reached a peak, the + rate is not really slowing down. + + We're on track this year to meet or surpass that number.} + +\end{frame} +
+% %% SLIDE: breakdown by time? +% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +% \begin{frame} + +% \includegraphics[width=\textwidth]{figures/wikipeda_citations_bytime.png} +% \end{frame} + +
+%% SLIDE: My Scope Conditions +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\begin{frame} + + \includegraphics[width=\textwidth]{figures/multiple_issues.png} + + \larger \larger + In selecting papers for this session, the goal is always to choose + examples of work that: + + \begin{itemize} + \larger \larger + \item Represent \e{important themes} from Wikipedia research in the last year. + \item Are likely to be of \e{interest} to Wikimedians. + \item Are by people who are \e{not at Wikimania}. + \end{itemize} + + Within these goals, the selections are \e{incomplete} and \e{wrong}. + + \note{This is my disclaimer slide...} +\end{frame} +
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\section{Paper Summaries} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +\subsection{Wikipedia in Context} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +
+%% SLIDE: Reagle and Loveland Citation +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\begin{frame}{Wikipedia in Historical Context} + + \larger \larger Loveland, Jeff, and Joseph Reagle. “Wikipedia and + Encyclopedic Production.” \emph{New Media \& Society} + (2013). DOI:10.1177/1461444812470428. +
+ \note{Jeff Loveland is a historian of encyclopedias. Joseph Reagle + is a media studies scholar who wrote the first book-length + academic treatment of Wikipedia. + + Loveland heard about Reagle's book through an article in the + Signpost but felt it was weak on history. So, they got together + and produced a great piece of work that places Wikipedia into + historical context.} +\end{frame} +
+%% SLIDE: Reagle and Loveland Overview +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\begin{frame}{Wikipedia in Historical Context} + + \larger \larger \larger Loveland and Reagle identify three modes + of encyclopedia production: + + \begin{itemize} + \larger \larger \larger + \item Compulsive collection + \item Stigmergic accumulation + \item Corporate production + \end{itemize} + + In each case, they see a connection between Wikipedia and methods of + the past. +
+ \note<1>{The authors identify three historical methods through which + encyclopedias were written and they suggest that, in different + ways, each plays a role in Wikipedia: + + \begin{itemize} + \item \e{Compulsive collection} refers to people who were individually + driven to collect information. Think Pliny the Elder. And then + think of Wikipediaholics and WikiBreak-enforcing software.
+ \item \e{Stigmergic accumulation} references `stigmergy', a + word from zoology that describes how wasps build nests, and + accumulation. In the past, this meant piracy and + building off of others. In Wikipedia, it means revision, + incorporation of other sources, and more. + \item \e{Corporate production} means working together with many + other people. Diderot took advantage of at least 140 different + authors. Think of the OED collecting information from + others. Wikipedia of course uses a similar model. + \end{itemize} + + In each case, they think that Wikipedia's model is not a total + break from the past in the way many people talk about it.} +
+ \note<2>{Now my own bias as a researcher is to look to more + quantitative or easy-to-apply work. + + \e{Takeaway:} But I think this is a great example of how much of the more + humanities-focused work on Wikipedia can do a wonderful job of providing us + context and a better way to think about and talk about what we're + doing.} +\end{frame} + +
+\subsection{Wikipedia as Data Source} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%% SLIDE: Citation +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\begin{frame}{Wikipedia as Data Source} + + \larger \larger + + Sérasset, Gilles. “Dbnary: Wiktionary as a LMF Based Multilingual RDF + Network.” In \emph{Proceedings of the Eighth International Conference on + Language Resources and Evaluation}, 2012. + + \begin{center} + \visible<2>{\url{http://dbnary.forge.imag.fr/}} + \end{center} +
+ \note<1>{There's a whole genre of paper that is about Wikipedia only + in that it uses WP as its dataset. This might even be a + \e{majority} of all papers published on Wikipedia. + + This paper up here, on a project called ``Dbnary'', is an attempt to + build a \e{lexical network} out of Wiktionary data. Essentially, + they are using Wiktionary as a network of words and their + relationships -- including definitions, translations, synonyms, + antonyms, etc. -- in different languages, often connected through + common etymologies. + + Lexical networks are essential to a whole family of + computerized natural language processing tasks and a variety of + linguistic projects. + + What I like about what Sérasset did is that he not only + used Wiktionary as a dataset but also did a bunch of work to make + Wiktionary more useful to others.} +
+ \note<2>{The researcher has created an open source tool – available + at the URL above. + + And anybody can use this tool, along with the dumps published + by the WMF, to produce their own copy of the network on their own + computer in about 5 minutes. + + The paper also contains a list of challenges that Wiktionary + contributors could address to make data extraction more effective + in the future. + + \e{Takeaway:} I think that this paper suggests, like a lot of + similar work, how Wikipedia's effect is broader than just what + comes through viewership on the web. And that there are important + ways we might be able to work with researchers like this to become + more effective.} + +\end{frame} +
+\subsection{Wikipedia and Quality} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%% SLIDE: Wikipedia and Quality Citation +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\begin{frame}{Wikipedia and Quality} + + \larger \larger + + Volsky, Peter G., Cristina M. Baldassari, Sirisha Mushti, and Craig + S. Derkay.
``Quality of Internet Information in Pediatric + Otolaryngology: A Comparison of Three Most Referenced Websites.'' + \emph{International Journal of Pediatric Otorhinolaryngology} 76, + no. 9 (September 2012): 1312–1316. DOI:10.1016/j.ijporl.2012.05.026. +
+ \note{There is a little industry of articles designed to evaluate + Wikipedia's quality. There are literally dozens of these each + year. And one thing that frustrates me is that it's very rare + that the people doing these studies coordinate with Wikipedia or that + Wikipedians systematically reach out to the people doing them to + learn. + + This is an example of one from pediatric otolaryngology. That is, + the study of diseases of the ear, nose, and throat -- in children.} + +\end{frame} +
+%% SLIDE: Results +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\begin{frame}{Wikipedia and Quality: Evaluation of Otolaryngology Articles} + \smaller \smaller + \begin{columns} + \column{0.53\textwidth} + \centering + + \includegraphics[width=0.6\textwidth]{figures/oto-content_accuracy.png} + + Accuracy as scored for content against a rubric\\ + developed from otolaryngology textbooks. + + \bigskip + + \includegraphics[width=0.6\textwidth]{figures/oto-errors_omissions.png} + + Mean numbers of errors and omissions. + + \column{0.47\textwidth} + \centering + + \includegraphics[width=0.6\textwidth]{figures/oto_reading_level.png} + + Flesch–Kincaid reading level. + + \bigskip + + \includegraphics[width=0.6\textwidth]{figures/oto-user_interface.png} + + Composite score for user interface. + + \end{columns} + + \bigskip + + {\centering + {\larger WK=Wikipedia; ML=MedlinePlus; EM=eMedicine.} + + } +
+ \note{Like many of these studies, this study compares Wikipedia to + other sites -- in this case, eMedicine and MedlinePlus. They used + a series of textbooks and experts to evaluate the content + errors and they used some standard systems to evaluate usability + and reading level. + + They find that Wikipedia has the most errors, the least accuracy, + and a medium reading level -- but it is similar in most cases to MedlinePlus. + + And Wikipedia had a rather good user interface compared to the + others. + + I'm not sure what that says about the others' user interfaces. + + \e{Takeaway:} We need to be better about getting these datasets and + helping integrate them into improving the encyclopedia.} \end{frame} +
+\subsection{Perception of Quality} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%% SLIDE: Perception of Quality +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\begin{frame}{Perception of Quality} + + \larger \larger Towne, W. Ben, Aniket Kittur, Peter Kinnaird, and + James Herbsleb. “Your Process Is Showing: Controversy Management and + Perceived Quality in Wikipedia.” In \emph{Proceedings of the 2013 + Conference on Computer Supported Cooperative Work}, 1059–1068. CSCW + ’13. New York, NY, USA: ACM, 2013. DOI:10.1145/2441776.2441896. +
+ \note{A group at Carnegie Mellon put together a really nice piece + that tried to surface Wikipedia's talk pages. Now, as many of you + will know intuitively, a majority of Wikipedia's work happens on + talk pages that are invisible to many users. What would happen if we + made this work more visible?} +\end{frame} +
+\begin{frame}{Perception of Quality: Towne et al.} + + \larger \larger + ``Laws, like sausages, cease to inspire respect in proportion + as we know how they are made.''\\ + \hfill -- John G.
Saxe + + \begin{itemize} + \larger \larger + \item<2-> Discussion $\Rightarrow$ Lower ratings + \item<3-> Unresolved conflict $\Rightarrow$ Even lower ratings + \item<4-> Discussion $\Rightarrow$ Higher reported perception of + Wikipedia and the article! + \end{itemize} +
+ \note{The goal was to test this theory in Wikipedia. + + They ran an experiment on Mechanical Turk to show people Wikipedia + articles and also to show them the talk pages. They then asked + people to rate the articles, and their perception of the article + and of Wikipedia. + + \begin{itemize} + \item When discussion is shown, quality ratings were significantly lower. + \item When discussion involving conflict was displayed, article + quality ratings were even lower. + \item If the editors involved in the conflict resolved it + through a positive collaboration approach, the negative + effects of conflict disappeared. + \item Participants reported that reading the discussion raised + their perceptions of both the article’s quality and Wikipedia + in general (i.e., they were not aware of the rating-lowering + effect the discussion actually had). + \end{itemize} + + \e{Takeaway:} There's a deep and interesting tradeoff that cuts to + the core of Wikimedia's two missions: empowering folks to get + involved in the process and presenting material to readers. This kind of work + explores big important questions at the heart of the Foundation's + work.} + +\end{frame} +
+\subsection{Tool Building for Wikipedians} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%% SLIDE: Tool Building for Wikipedians +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\begin{frame}{Tool Building for Wikipedians} + + \larger \larger Solorio, Thamar, Ragib Hasan, and Mainul Mizan. ``A + Case Study of Sockpuppet Detection in Wikipedia.'' In + \emph{Proceedings of the Workshop on Language in Social Media}, + 59–68. Atlanta, Georgia, USA: Association for Computational + Linguistics, 2013. +
+ \note<1>{This is a paper from a computational linguistics conference. And + they set out to create a method to identify sockpuppets in + Wikipedia. + + There's a little academic industry designed to detect authorship + across texts and aliases. But one problem that literature has is + that it has almost no data on people \e{trying} to hide their + identity whose identity was later confirmed. + + Wikipedia has no such problem. There were more than 2,700 cases of + suspected sockpuppeting in Wikipedia in 2012 alone.} +
+ \note<2>{They use a database of confirmed (with checkuser) and rejected + cases of sockpuppeting to train a machine-learning-based approach + to classify edits. + + The system achieved an accuracy of 68.83\% in the tested cases. + + This is not very good because simply always confirming the + suspected sockpuppet abuse would have achieved 53.24\% accuracy. + After adding features based on the user's edit frequency by time + of day and day of the week, it achieved 84.04\% accuracy. + + The authors have ideas for creating a system that could run in the + background and detect sockpuppets. But even if that never happens, + community members have done similar work in the past. And this + represents a set of tools and techniques from which the community + could directly benefit.
+ + \e{Takeaway:} We need to get better about working with all the + people, like these researchers, who are building tools for our communities.} + + +\end{frame} + +
+\subsection{Effects of Feedback} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%% SLIDE: Effects of Feedback Citation +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\begin{frame}{Effects of Feedback} + + \larger \larger Zhu, Haiyi, Amy Zhang, Jiping He, Robert Kraut, and Aniket + Kittur. ``Effects of Peer Feedback on Contribution: A Field + Experiment in Wikipedia.'' In \emph{Proceedings of the SIGCHI + Conference on Human Factors in Computing Systems}. Paris, France: + ACM, 2013. +
+ \note{There have been a whole bunch of studies which have looked at + the effects of feedback on contribution to Wikipedia. Reverts, + welcome messages, etc. And they have shown a series of effects. + + But one concern with this work is that it is not causal. People + who receive negative messages are often behaving differently than + people who do not. + + This reflects a real experiment, done in Wikipedia, where + different types of feedback were randomly assigned. + + Between August and November 2011, they left feedback for 703 creators of + new articles in Wikipedia, waiting at least two days and making sure + the article had a certain amount of content and had not been + tagged for speedy deletion.} + +\end{frame} +
+%% SLIDE: Effects of Feedback Figures +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\begin{frame}{Effects of Feedback: Zhu et al.} + \centering + + \includegraphics[height=0.85\textheight]{figures/shared_leadership-figures.pdf} +
+ \note{They left four kinds of feedback: positive, negative, + directive, and social. + + And they were interested in both the effect on editing in the new + article they mentioned and on general editing on Wikipedia. + + Feedback had no effect at all on experienced contributors. At + all. This was surprising to the folks running the study but maybe + not to the folks in this room. + + Among newbies, they found that negative feedback and directive + feedback had a positive effect on editing in the focal article, and + positive feedback had an effect on general editing (but not on the + article in question). And they found no other effects. + + \e{Takeaway:} We should learn from and improve our processes based + on studies like these. We should work with researchers to do more + experiments. There are important ethical implications. There was a + long section of the paper about talking to the research committee and + others.} + +\end{frame} + +
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\section{Conclusion} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%% SLIDE: Other Resources +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\begin{frame}{More Resources} + + \begin{itemize} + \larger \larger \larger + \item \e{Wikimedia Research Newsletter} [[:meta:Research:Newsletter]] + \item \e{WikiSym} (Last week in Hong Kong!) + \item \e{WikiPapers Repository} [http://wikipapers.referata.com] + \item \e{Much More} + \end{itemize} +
+ \note{Those are my six postcards. + + There has been just tons and tons of work in this area. Trying to + talk about this in 20 minutes strikes me as increasingly crazy + every year I try to do it. + + The most important source, now going for a couple of years, is the + Wikimedia Research Newsletter, which is published monthly in the + Signpost.
+ + But there are other resources as well. And I encourage you to get + involved.} + +\end{frame} \end{document} diff --git a/figures/Wikipedia publications - Data.csv b/figures/Wikipedia publications - Data.csv new file mode 100644 index 0000000..0dde3db --- /dev/null +++ b/figures/Wikipedia publications - Data.csv @@ -0,0 +1,15 @@ +,wikipedia,corpus,quality,reputation,gender,collaboration,education,,,network,model +2001,18,1,0,0,0,0,0,,,0,0 +2002,8,0,0,0,0,0,0,,,0,0 +2003,12,0,0,0,0,0,0,,,0,0 +2004,47,0,0,0,0,4,0,,,0,0 +2005,213,4,5,1,0,2,5,,,3,0 +2006,354,10,6,7,0,11,7,,,9,0 +2007,570,26,10,6,0,13,7,,,4,5 +2008,634,44,12,10,0,13,16,,,8,8 +2009,721,53,16,8,2,16,19,,,9,13 +2010,754,79,12,10,4,12,20,,,9,12 +2011,692,59,15,18,4,41,29,,,22,15 +2012,674,76,19,7,4,22,25,,,24,15 +2013,435,49,12,3,3,22,29,,,15,14 +2013 to date,255,29,7,2,2,13,17,,,9,8 \ No newline at end of file diff --git a/figures/citations_by_year.pdf b/figures/citations_by_year.pdf new file mode 100644 index 0000000..37ff491 Binary files /dev/null and b/figures/citations_by_year.pdf differ diff --git a/figures/cite_graph.R b/figures/cite_graph.R new file mode 100644 index 0000000..ce88e41 --- /dev/null +++ b/figures/cite_graph.R @@ -0,0 +1,36 @@ +# the last line is projected based on citations to the end of october +# (almost certainly conservative) +library(ggplot2) + +d <- read.csv("wikipedia_citations.txt",header=F) +colnames(d) <- c("year", "citations") + +# print the total number of citations +sum(d$citations) + +# generate and print a graph +# p <- qplot(year, citations, data=d) + +# geom_line(colour="blue") + geom_point(colour="blue") + +p <- qplot(factor(year), citations, data=d, geom="bar", fill=I("darkblue")) +p <- p + scale_x_discrete("Year") + scale_y_continuous("Number of Papers") + +pdf("citations_by_year.pdf", width=7.5, height=5.3) +print(p) +dev.off() + +## data from dario +##########################################################3 + +# import data from dario +d <- read.csv("Wikipedia publications - Data.csv") + +# clean up the dates +colnames(d)[1] <- "date" +d <- d[,c(-9,-10)] +d <- d[!d$date == "2013 to date",] +d$date <- as.factor(d$date) + +library(reshape) +qplot(date, value, data=melt(d), group=variable, geom="line") + + aes(colour=variable) + scale_y_log10() diff --git a/figures/google_scholar_result.png b/figures/google_scholar_result.png new file mode 100644 index 0000000..75ac820 Binary files /dev/null and b/figures/google_scholar_result.png differ diff --git a/figures/oto-content_accuracy.png b/figures/oto-content_accuracy.png new file mode 100644 index 0000000..33343d0 Binary files /dev/null and b/figures/oto-content_accuracy.png differ diff --git a/figures/oto-errors_omissions.png b/figures/oto-errors_omissions.png new file mode 100644 index 0000000..a050a33 Binary files /dev/null and b/figures/oto-errors_omissions.png differ diff --git a/figures/oto-user_interface.png b/figures/oto-user_interface.png new file mode 100644 index 0000000..2e63a43 Binary files /dev/null and b/figures/oto-user_interface.png differ diff --git a/figures/oto_reading_level.png b/figures/oto_reading_level.png new file mode 100644 index 0000000..5232c56 Binary files /dev/null and b/figures/oto_reading_level.png differ diff --git a/figures/shared_leadership-figures.pdf b/figures/shared_leadership-figures.pdf new file mode 100644 index 0000000..11760e0 Binary files /dev/null and b/figures/shared_leadership-figures.pdf differ diff --git a/figures/wikipedia_citations.txt b/figures/wikipedia_citations.txt new file mode 
100644 index 0000000..89d95ad --- /dev/null +++ b/figures/wikipedia_citations.txt @@ -0,0 +1,12 @@ +2001,18 +2002,8 +2003,12 +2004,47 +2005,213 +2006,345 +2007,471 +2008,634 +2009,720 +2010,754 +2011,692 +2012,674 diff --git a/outline.org b/outline.org new file mode 100644 index 0000000..fe21c4c --- /dev/null +++ b/outline.org @@ -0,0 +1,81 @@ +* DONE Wikipedia in Context +** DONE Reagle and Loveland on "Wikipedia and encyclopedic production" +* How Wikipedia is Organized +** Butler et al: Eyes on the prize: officially sanctioned rule breaking in mass collaboration systems +* Motivating Editors +** Haiyi on Effects of Peer Feedback on Contribution: A Field Experiment in Wikipedia + +One of the most significant challenges for many online communities is +increasing members' contributions over time. Prior studies on peer +feedback in online communities have suggested its impact on +contribution, but have been limited by their correlational nature. In +this paper, we conducted a field experiment on Wikipedia to test the +effects of different feedback types (positive feedback, negative +feedback, directive feedback, and social feedback) on members' +contribution. Our results characterize the effects of different +feedback types, and suggest trade-offs in the effects of feedback +between the focal task and general motivation, as well as differences +in how newcomers and experienced editors respond to peer +feedback. This research provides insights into the mechanisms +underlying peer feedback in online communities and practical guidance +to design more effective peer feedback systems. + +* DONE Tool Development for Wikipedia +** DONE A Case Study of Sockpuppet Detection in Wikipedia +* DONE Wikipedia as Data Source +** DONE Dbnary: Wiktionary as a LMF based Multilingual RDF network +* DONE Evaluating Wikipedia's Quality +** DONE Quality of Internet information in pediatric otolaryngology: A comparison of three most referenced websites +** Presence and adequacy of pharmaceutical preparations in the Spanish edition of Wikipedia +* DONE Judging Quality of Wikipedia +** DONE Your process is showing: controversy management and perceived quality in wikipedia + +Nikki et al. + +** Understanding trust formation in digital information sources: The case of Wikipedia + +An article[5] in the Journal of Information Science, titled +"Understanding trust formation in digital information sources: The +case of Wikipedia", explores the criteria used by students to evaluate +the credibility of Wikipedia articles. It contains an overview of +various earlier studies about credibility judgments of Wikipedia +articles (some of them reviewed previously in this space, example: +"Quality of featured articles doesn't always impress readers"). + +The authors asked "20 second-year undergraduate students and 30 +Master’s students" in information studies to first spend 20 minutes +reading "a copy of a two-page Wikipedia article on Generation Z, a +topic with which students were expected to have some familiarity", and +answer an open-ended question explaining how they would judge its +trustworthiness. In a subsequent part, the respondents were asked to +rank a list of factors for trustworthiness in case of "either (a) the +topic of an assignment, or (b) a minor medical condition from which +they were suffering". 
One of the first findings was a "low +pre-disposition to use [Wikipedia], possibly suggesting a propensity +to distrust, grounded on debates and comments on the trustworthiness +of Wikipedia" – possibly due to the fact that the example article +contained an example of vandalism, a fact highlighted by several +respondents (e.g. "started off as a valid entry ... due to citations +strengthening this ... however came to the last paragraph and the +whole document was marred by the insert of 'writing articles on +Wikipedia while on amphetamines' [as purported hobby of Generation Z +members]... just feels that you can't trust anything now"). + +Among the given trustworthiness factors, the following were ranked +most highly: + + authorship, currency, references, expert recommendation and + triangulation/verification, with usefulness just below this + threshold. + +In other words, participants valued having articles that were written +by experts on the subject, that were up to date, and that they +perceived to be useful (content factors). ... Interestingly these +factors all seemed more or less equally important for both contexts, +with the exception of references, which for predictable reasons were +seen as having greater importance in the context of assignments. + +* Viewership +** "Science eight times more popular on the Spanish Wikipedia than on the English Wikipedia" +* Not Presenting +** Ayelet Oz Paper