Thomas Steiner (@tomayac)

Now at @tomayac@toot.cafe

Below is an off-site archive of every tweet ever posted by @tomayac.

February 14th, 2014

RT @hfmuehleisen: Breaking news picture galleries based on #Wikipedia edits as @twitter feed: @mediagalleries by @tomayac! #FF

via Echofon

From commoncrawl.org/our-work/ of @CommonCrawl: “We strive to be transparent in all of our operations.”—Any info on the crawl strategy? Vague…

via Echofon

@rtroncy @silviapfeiffer Not a single one… We’re working on a paper at the moment.

via Echofon

@silviapfeiffer .srt: 390; & dynamic stuff that may be (/includes/assets/vtt/?f=/2013/CLIP_JBU_LR/) or not (/videos/1095/captions).

via Echofon in reply to silviapfeiffer

Of the 1,456 <track> (http://t.co/N5J1dMzpL1), distrib. of src attrib: matches /.(web)?vtt$/i: 66, other: 1390. @silviapfeiffer #WebVTT

via Echofon
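
The src split reported in the tweet above could be reproduced roughly as follows. This is a minimal sketch, not the code used for the analysis: the function names are hypothetical, the sample list is made up apart from the two paths quoted in the reply to @silviapfeiffer, and the dot in the pattern is escaped here so that it matches a literal "." before the extension.

```python
import re
from collections import Counter

# Pattern modeled on the one quoted in the tweet. The dot is escaped here
# (an assumption about the intent) so that it only matches a literal ".".
VTT_SRC = re.compile(r"\.(web)?vtt$", re.IGNORECASE)


def classify_src(src: str) -> str:
    """Return "vtt" if the src value ends in .vtt or .webvtt, else "other"."""
    return "vtt" if VTT_SRC.search(src) else "other"


def src_distribution(srcs):
    """Count how many src values fall into each bucket."""
    return Counter(classify_src(s) for s in srcs)


# Illustrative values only; the last two paths are the ones quoted in the thread.
sample = [
    "captions/en.vtt",
    "subs/EN.WEBVTT",
    "clip.srt",
    "/includes/assets/vtt/?f=/2013/CLIP_JBU_LR/",
    "/videos/1095/captions",
]

dist = src_distribution(sample)
print(dist["vtt"], dist["other"])  # prints: 2 3
```

Matching on the file extension alone puts dynamically served captions such as /videos/1095/captions into "other", which is consistent with the caveat about dynamic URLs in the reply.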

Of the 1,456 <track> (http://t.co/N5J1dMzpL1), distrib. of kind: captions: 915, subtitles: 525, chapters: 2, (rest invalid). @silviapfeiffer

via Echofon
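
A tally of kind values like the one above might look like this. It is a sketch under assumptions: the keyword set is the list of track kinds defined by the HTML specification, the function name is hypothetical, and the sample values are invented. The last lines only redo the arithmetic from the tweet to show how many tracks the "(rest invalid)" remainder covers.

```python
from collections import Counter

# Keywords the HTML specification defines for the kind attribute of <track>.
VALID_KINDS = {"subtitles", "captions", "descriptions", "chapters", "metadata"}


def kind_distribution(kinds):
    """Count kind values, grouping everything outside the keyword set as "invalid"."""
    counts = Counter()
    for kind in kinds:
        normalized = kind.lower()  # enumerated attribute values are case-insensitive
        counts[normalized if normalized in VALID_KINDS else "invalid"] += 1
    return counts


# Invented sample values, for illustration only.
sample = ["captions", "Subtitles", "chapters", "caption", "subs"]
dist = kind_distribution(sample)

# Figures taken from the tweet; the remainder is what it calls "rest invalid".
reported = {"captions": 915, "subtitles": 525, "chapters": 2}
total_tracks = 1456
rest = total_tracks - sum(reported.values())
print(rest)  # prints: 14
```

Whether a missing kind attribute was counted as invalid in the original analysis is not stated, so this sketch only handles values that are present.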

Stats from @hfmuehleisen’s & my recent @CommonCrawl analysis on the Winter 2013 dataset: <video>: 2,963,766; out of which w/ <track>: 1,456.

via Echofon
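
Counting video elements and those that contain at least one track could be approached as below. This is only a sketch using Python's standard html.parser on a single made-up document; the class name is hypothetical, and the actual analysis pipeline over the @CommonCrawl data is not described in these tweets. The final lines compute the share implied by the two reported totals.

```python
from html.parser import HTMLParser


class VideoTrackCounter(HTMLParser):
    """Counts <video> elements and how many of them contain a <track>."""

    def __init__(self):
        super().__init__()
        self.videos = 0
        self.videos_with_track = 0
        # One flag per currently open <video>: has a <track> been seen yet?
        self._open = []

    def handle_starttag(self, tag, attrs):
        if tag == "video":
            self.videos += 1
            self._open.append(False)
        elif tag == "track" and self._open and not self._open[-1]:
            # First <track> inside the innermost open <video>.
            self._open[-1] = True
            self.videos_with_track += 1

    def handle_endtag(self, tag):
        if tag == "video" and self._open:
            self._open.pop()


# Made-up markup: one video without tracks, one with two tracks.
html = (
    '<video src="a.mp4"></video>'
    '<video><source src="b.webm">'
    '<track kind="captions" src="b.vtt">'
    '<track kind="subtitles" src="b-de.vtt"></video>'
)

counter = VideoTrackCounter()
counter.feed(html)
print(counter.videos, counter.videos_with_track)  # prints: 2 1

# Share implied by the totals reported in the tweet.
share = 1456 / 2963766
print(f"{share:.4%}")  # prints: 0.0491%
```

In other words, roughly one in every 2,000 video elements in the reported totals carried a track. A video that is never closed would keep collecting later track elements in this sketch, so real-world markup would need more careful handling.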