RT @hfmuehleisen: Breaking news picture galleries based on #Wikipedia edits as @twitter feed: @mediagalleries by @tomayac! #FF
From commoncrawl.org/our-work/ of @CommonCrawl: “We strive to be transparent in all of our operations.”—Any info on the crawl strategy? Vague…
mediagalleries #BreakingNews candidate via @WikiLiveMon: fr.wikipedia.org/wiki/Kelud, pic.twitter.com/zrMYfWRTfN
~RT @mediagalleries: #BreakingNews candidate via @WikiLiveMon: http://t.co/MNfSbEEXhB, http://t.co/arm53NaVIo > LOL ⤠http://t.co/CPAOcBwAZN
@rtroncy @silviapfeiffer Not a single one… We’re working on a paper at the moment.
~RT @mediagalleries: #BreakingNews candidate via @WikiLiveMon: http://t.co/8tSNvxJJbL, http://t.co/nHwqir2yjo // #CopaLibertadores
@silviapfeiffer .srt: 390; & dynamic stuff that may be #WebVTT (/includes/assets/vtt/?f=/2013/CLIP_JBU_LR/) or not (/videos/1095/captions).
Of the 1,456 <track> (http://t.co/N5J1dMzpL1), distrib. of src attrib: matches /.(web)?vtt$/i: 66, other: 1390. @silviapfeiffer #WebVTT
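A minimal sketch of how the src split above could be reproduced, assuming the `src` values have already been extracted into a list of strings. The pattern as quoted in the tweet has an unescaped dot, which matches any character; the sketch escapes it, which is an assumption about intent. `classify_src` and `src_distribution` are hypothetical helper names, and the first three sample values are made up (the last two paths come from the tweets in this feed).

```python
import re
from collections import Counter

# Escaped-dot variant of the pattern quoted in the tweet (assumption about intent).
VTT_SRC = re.compile(r"\.(web)?vtt$", re.IGNORECASE)


def classify_src(src: str) -> str:
    """Return 'vtt' if the value ends in .vtt or .webvtt (any case), else 'other'."""
    return "vtt" if VTT_SRC.search(src) else "other"


def src_distribution(srcs):
    """Count how many src values fall into each bucket."""
    return Counter(classify_src(s) for s in srcs)


# Illustrative values only, not rows from the dataset.
sample = [
    "captions/en.vtt",
    "subs/talk.WEBVTT",
    "subs/talk.srt",
    "/includes/assets/vtt/?f=/2013/CLIP_JBU_LR/",
    "/videos/1095/captions",
]
print(src_distribution(sample))
```

Because the pattern is anchored at the end of the string, dynamically served files such as the two paths quoted in this feed land in "other" even if they return WebVTT.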
Of the 1,456 <track> (http://t.co/N5J1dMzpL1), distrib. of kind: captions: 915, subtitles: 525, chapters: 2, (rest invalid). @silviapfeiffer
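The counts above leave 1,456 − (915 + 525 + 2) = 14 values in the "invalid" bucket. A rough sketch of such a tally, assuming the `kind` values are available as a list (with `None` for a missing attribute): the five valid keywords are those defined for `<track kind>` in HTML, and matching is case-insensitive. How the original analysis treated missing or oddly cased values is not stated, so the separate "(missing)" bucket and the helper names are assumptions.

```python
from collections import Counter

# Keywords defined for the kind attribute of <track> in HTML.
VALID_KINDS = {"subtitles", "captions", "descriptions", "chapters", "metadata"}


def bucket_kind(kind):
    """Map a raw kind value to a valid keyword, '(missing)', or '(invalid)'."""
    if kind is None:
        return "(missing)"  # assumption: report absent attributes separately
    lowered = kind.lower()
    return lowered if lowered in VALID_KINDS else "(invalid)"


def kind_distribution(kinds):
    """Count how many kind values fall into each bucket."""
    return Counter(bucket_kind(k) for k in kinds)


# Illustrative values only, not rows from the dataset.
sample = ["captions", "Subtitles", "chapters", "caption", "", None]
print(kind_distribution(sample))
```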
Stats from @hfmuehleisen’s & my recent @CommonCrawl analysis on the Winter 2013 dataset: <video>: 2,963,766; out of which w/ <track>: 1,456.
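For scale, 1,456 out of 2,963,766 is roughly 0.05%. A rough per-page sketch of this kind of count using only Python's standard-library `html.parser`: it is not the pipeline used for the actual analysis, and it assumes the figure refers to `<video>` elements containing at least one `<track>`. `VideoTrackCounter` and `count_videos` are hypothetical names.

```python
from html.parser import HTMLParser


class VideoTrackCounter(HTMLParser):
    """Count <video> elements and how many of them contain a <track>."""

    def __init__(self):
        super().__init__()
        self.videos = 0
        self.videos_with_track = 0
        # One flag per currently open <video>: has it seen a <track> yet?
        self._open = []

    def handle_starttag(self, tag, attrs):
        if tag == "video":
            self.videos += 1
            self._open.append(False)
        elif tag == "track" and self._open and not self._open[-1]:
            # First <track> inside the innermost open <video>.
            self._open[-1] = True
            self.videos_with_track += 1

    def handle_endtag(self, tag):
        if tag == "video" and self._open:
            self._open.pop()


def count_videos(html):
    """Return (number of <video>, number of <video> with at least one <track>)."""
    parser = VideoTrackCounter()
    parser.feed(html)
    parser.close()
    return parser.videos, parser.videos_with_track


# Illustrative markup only.
page = (
    '<video src="a.mp4"></video>'
    '<video><source src="b.webm">'
    '<track kind="captions" src="b.vtt">'
    '<track kind="subtitles" src="b-de.vtt"></video>'
)
print(count_videos(page))
```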