added wikipedia_namespaces.csv file
author Benjamin Mako Hill <mako@atdot.cc>
Thu, 23 Aug 2018 23:38:01 +0000 (16:38 -0700)
committer Benjamin Mako Hill <mako@atdot.cc>
Thu, 23 Aug 2018 23:38:01 +0000 (16:38 -0700)
This file was required to run the scripts but was accidentally not included.

README
wikipedia_namespaces.csv [new file with mode: 0644]

diff --git a/README b/README
index 7995e407d65ec4a6706a84af414afe83bc1ea0a8..342776d76f494ce2e78ca5a5e857a7bf48cbde99 100644
--- a/README
+++ b/README
@@ -75,6 +75,15 @@ Running the Software
 - GNU R
 - `data.table` R package available on CRAN
 
+There is also a dependency on a file called `wikipedia_namespaces.csv`
+which is included in this repository and which is drawn from data on
+this page: https://en.wikipedia.org/wiki/Wikipedia:Namespace
+
+This file is taken from English Wikipedia in 2015. If you are working
+with different wikis or with an updated dump, you will likely need
+to update this file.
+
+
 1. Download Dumps
 ==========================
 
diff --git a/wikipedia_namespaces.csv b/wikipedia_namespaces.csv
new file mode 100644
index 0000000..7474d7f
--- /dev/null
+++ b/wikipedia_namespaces.csv
@@ -0,0 +1,39 @@
+id,name,alias
+0,,FALSE
+1,Talk,FALSE
+2,User,FALSE
+3,User talk,FALSE
+4,Wikipedia,FALSE
+5,Wikipedia talk,FALSE
+6,File,FALSE
+7,File talk,FALSE
+8,MediaWiki,FALSE
+9,MediaWiki talk,FALSE
+10,Template,FALSE
+11,Template talk,FALSE
+12,Help,FALSE
+13,Help talk,FALSE
+14,Category,FALSE
+15,Category talk,FALSE
+100,Portal,FALSE
+101,Portal talk,FALSE
+108,Book,FALSE
+109,Book talk,FALSE
+118,Draft,FALSE
+119,Draft talk,FALSE
+446,Education Program,FALSE
+447,Education Program talk,FALSE
+710,TimedText,FALSE
+711,TimedText talk,FALSE
+828,Module,FALSE
+829,Module talk,FALSE
+2600,Topic,FALSE
+-1,Special,FALSE
+-2,Media,FALSE
+4,WP,TRUE
+4,Project,TRUE
+5,WT,TRUE
+5,Project talk,TRUE
+6,Image,TRUE
+7,Image talk,TRUE
+
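
The CSV added above maps numeric namespace ids to names, with `alias` set to TRUE for alternate prefixes such as `WP` or `Image`. The commit does not show how the scripts consume the file, so the following is only a minimal sketch of one plausible use: resolving a page title's prefix to a namespace id. The helper names `load_namespaces` and `namespace_of` are hypothetical, the inline data is a short excerpt of the file, and the exact-match lookup is an assumption (it does not normalize case or underscores).

```python
import csv
import io

# Excerpt of wikipedia_namespaces.csv from the diff above (columns: id,name,alias).
NAMESPACES_CSV = """\
id,name,alias
0,,FALSE
1,Talk,FALSE
2,User,FALSE
4,Wikipedia,FALSE
5,Wikipedia talk,FALSE
6,File,FALSE
4,WP,TRUE
5,WT,TRUE
6,Image,TRUE
"""


def load_namespaces(handle):
    """Return a dict mapping each namespace name or alias to its numeric id.

    Hypothetical helper: canonical names and aliases share one lookup table.
    """
    name_to_id = {}
    for row in csv.DictReader(handle):
        name_to_id[row["name"]] = int(row["id"])
    return name_to_id


def namespace_of(title, name_to_id):
    """Guess a title's namespace id from the text before its first colon.

    Assumption: a title with no colon, or with an unrecognized prefix,
    belongs to the main namespace (id 0).
    """
    prefix, sep, _rest = title.partition(":")
    if not sep:
        return 0
    return name_to_id.get(prefix, 0)


namespaces = load_namespaces(io.StringIO(NAMESPACES_CSV))
print(namespace_of("WP:Namespace", namespaces))
```

To read the real file instead of the excerpt, the same function could be called as `load_namespaces(open("wikipedia_namespaces.csv", newline=""))`. Because aliases resolve to the same id as the canonical name, adding rows for another wiki's local prefixes would not require code changes in this sketch.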
