hfmuehleisen Breaking news picture galleries based on #Wikipedia edits as @Twitter feed: @mediagalleries by @tomayac! #FF
From commoncrawl.org/our-work/ of @CommonCrawl: “We strive to be transparent in all of our operations.”—Any info on the crawl strategy? Vague…

mediagalleries #BreakingNews candidate via @WikiLiveMon: fr.wikipedia.org/wiki/Kelud, pic.twitter.com/zrMYfWRTfN

~RT @mediagalleries: #BreakingNews candidate via @WikiLiveMon: sv.wikipedia.org/wiki/Alla_hj%C…, pic.twitter.com/arm53NaVIo > LOL ❤ flickr.com/photos/3913509…
@rtroncy @silviapfeiffer Not a single one… We’re working on a paper at the moment.

~RT @mediagalleries: #BreakingNews candidate via @WikiLiveMon: es.wikipedia.org/wiki/Copa_Libe…, pic.twitter.com/nHwqir2yjo // #CopaLibertadores
@silviapfeiffer .srt: 390; & dynamic stuff that may be #WebVTT (/includes/assets/vtt/?f=/2013/CLIP_JBU_LR/) or not (/videos/1095/captions).
Of the 1,456 <track> (twitter.com/tomayac/status…), distrib. of src attrib: matches /\.(web)?vtt$/i: 66, other: 1390. @silviapfeiffer #WebVTT
Of the 1,456 <track> (twitter.com/tomayac/status…), distrib. of kind: captions: 915, subtitles: 525, chapters: 2, (rest invalid). @silviapfeiffer
Stats from @hfmuehleisen’s & my recent @CommonCrawl analysis on the Winter 2013 dataset: <video>: 2,963,766; out of which w/ <track>: 1,456.
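A rough sketch of how tallies like the ones above could be computed for a single HTML page. This is not the actual analysis pipeline, which the tweets do not describe. The names TrackCounter and count_tracks are hypothetical, the regex is the one quoted above with the dot escaped, and counting only <track> elements nested in a <video>, as well as reporting a missing kind separately from an invalid one, are assumptions.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Pattern from the tweet, with the dot escaped (assumption about the intent).
VTT_SRC = re.compile(r"\.(web)?vtt$", re.IGNORECASE)
# Keywords the HTML Standard defines for the kind attribute of <track>.
VALID_KINDS = {"subtitles", "captions", "descriptions", "chapters", "metadata"}


class TrackCounter(HTMLParser):
    """Hypothetical helper: tallies <video> and nested <track> elements."""

    def __init__(self):
        super().__init__()
        self.videos = 0
        self.videos_with_track = 0
        self.src = Counter()   # "vtt" vs. "other"
        self.kind = Counter()  # valid keyword, "invalid", or "missing"
        self._stack = []       # one flag per open <video>: track seen yet?

    def handle_starttag(self, tag, attrs):
        if tag == "video":
            self.videos += 1
            self._stack.append(False)
        elif tag == "track" and self._stack:
            # Count each <video> at most once as "with <track>".
            if not self._stack[-1]:
                self._stack[-1] = True
                self.videos_with_track += 1
            attributes = dict(attrs)
            src = (attributes.get("src") or "").strip()
            self.src["vtt" if VTT_SRC.search(src) else "other"] += 1
            kind = (attributes.get("kind") or "").strip().lower()
            if not kind:
                self.kind["missing"] += 1
            elif kind in VALID_KINDS:
                self.kind[kind] += 1
            else:
                self.kind["invalid"] += 1

    def handle_endtag(self, tag):
        if tag == "video" and self._stack:
            self._stack.pop()


def count_tracks(html):
    """Parse one HTML document and return the filled-in counter."""
    parser = TrackCounter()
    parser.feed(html)
    parser.close()
    return parser
```

With this pattern, a dynamic URL such as /videos/1095/captions lands in "other" even if it serves WebVTT, which is the caveat raised in the .srt tweet above.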