rss-bridge
rss-bridge is a PHP project capable of generating ATOM feeds for websites which don't have one.
Supported sites/pages (main)
- FlickrExplore: Latest interesting images from Flickr
- GoogleSearch: Most recent results from Google Search
- GooglePlus: Most recent posts of user timeline
- Twitter: Return keyword/hashtag search or user timeline
- Identi.ca: Identica user timeline (Should be compatible with other Pump.io instances)
- YouTube: YouTube user channel, playlist or search
- Cryptome: Returns the most recent documents from Cryptome.org
- DansTonChat: Most recent quotes from danstonchat.com
- DuckDuckGo: Most recent results from DuckDuckGo.com
- Instagram: Most recent photos from an Instagram user
- OpenClassrooms: Latest tutorials from fr.openclassrooms.com
- Pinterest: Most recent photos from user or search
- ScmbBridge: Newest stories from secouchermoinsbete.fr
- WikipediaENLatest: highlighted articles from Wikipedia in English
- WikipediaFRLatest: highlighted articles from Wikipedia in French
- WikipediaEOLatest: highlighted articles from Wikipedia in Esperanto
- Bandcamp: Returns last release from bandcamp for a tag
- ThePirateBay: Returns the newest indexed torrents from The Pirate Bay with keywords
Plus many other bridges to enable, thanks to the community
Output format
Output format can take several forms:
- Atom: ATOM Feed, for use in RSS/Feed readers
- Json: Json, for consumption by other applications.
- Html: Simple html page.
- Plaintext: raw text (php object, as returned by print_r)
Screenshot
Welcome screen:
Minecraft hashtag (#Minecraft) search on Twitter, in ATOM format (as displayed by Firefox):
Requirements
- PHP 5.3
- openssl extension enabled in PHP config (php.ini)
Enabling/Disabling bridges
By default, the script creates whitelist.txt and adds the main bridges (see above). whitelist.txt is ignored by git, you can edit it:
- to enable extra bridges (one bridge per line)
- to disable main bridges (remove the line)
New bridges are disabled by default, so make sure to check regularly what's new and whitelist what you want!
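To make the one-bridge-per-line rule concrete, here is a minimal sketch of how such a file could be read. It is not the code rss-bridge actually uses: `parseWhitelist()` and `isBridgeEnabled()` are hypothetical helpers, and the bridge names in the example are placeholders (check your own whitelist.txt for the exact spelling). It sticks to PHP 5.3 syntax to match the requirements above.

```php
<?php
// Hypothetical helper, not part of rss-bridge.
// Turns the text of a whitelist file into a list of enabled bridge names.
function parseWhitelist($contents)
{
    $enabled = array();
    // Split on any newline style so files edited on Windows also work.
    foreach (preg_split('/\r\n|\r|\n/', $contents) as $line) {
        $name = trim($line);
        if ($name !== '') {
            // Blank lines are skipped; every other line is one bridge.
            $enabled[] = $name;
        }
    }
    return $enabled;
}

// Hypothetical helper: true when $bridgeName has its own line in the whitelist.
function isBridgeEnabled($bridgeName, array $enabled)
{
    return in_array($bridgeName, $enabled, true);
}

// Placeholder names; in practice you would pass file_get_contents('whitelist.txt').
$enabled = parseWhitelist("FlickrExplore\nTwitter\n\nThePirateBay\n");
var_dump(isBridgeEnabled('Twitter', $enabled));  // bool(true)
var_dump(isBridgeEnabled('Bandcamp', $enabled)); // bool(false)
```

Removing a line disables that bridge, adding a line enables it, which is all the two bullet points above ask for.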
Author
I'm sebsauvage, webmaster of sebsauvage.net, author of Shaarli and ZeroBin.
Patch/contributors :
- Yves ASTIER (Draeli) : PHP optimizations, fixes, dynamic bridge/format list with all stuff behind and extended cache system. Mail : contact@yves-astier.com
- Mitsukarenai : Initial inspiration, collaborator
- ArthurHoaro
- BoboTiG
- Astalaseven
- qwertygc
- Djuuu
- Anadrark
- Grummfy
- Polopollo
- 16mhz
- kranack
License
Code is Public Domain.
Including PHP Simple HTML DOM Parser under the MIT License.
Technical notes
- There is a cache so that source services won't ban you even if you hammer the rss-bridge with requests. Each bridge has a different duration for the cache. The cache subdirectory will be automatically created. You can purge it whenever you want.
- To implement a new rss-bridge, create a new class in the bridges subdirectory. Look at existing bridges for examples and the guidelines below. For items you generate in $this->items, only uri and title are mandatory in each item. timestamp and content are optional but recommended. Any additional key will be ignored by ATOM feed (but output to json).
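The item rule above (uri and title mandatory, timestamp and content recommended) can be sketched as a small check. This is only an illustration: `checkItem()` is a hypothetical helper, and it treats an item as a plain array, which is an assumption. Look at an existing bridge to see how items are really built.

```php
<?php
// Hypothetical helper, not part of rss-bridge.
// Checks one item (assumed here to be a plain array) against the rule above.
// Returns the missing mandatory keys as 'errors' and the missing
// recommended keys as 'warnings'.
function checkItem(array $item)
{
    $result = array('errors' => array(), 'warnings' => array());

    // uri and title are mandatory and must not be empty.
    foreach (array('uri', 'title') as $key) {
        if (!isset($item[$key]) || trim((string) $item[$key]) === '') {
            $result['errors'][] = $key;
        }
    }

    // timestamp and content are optional but recommended.
    foreach (array('timestamp', 'content') as $key) {
        if (!isset($item[$key])) {
            $result['warnings'][] = $key;
        }
    }

    return $result;
}

// Example values are made up for illustration.
$report = checkItem(array(
    'uri'       => 'http://example.com/post/1',
    'title'     => 'First post',
    'timestamp' => 1400000000,
));
print_r($report); // no errors, one warning: content
```

Any extra key is simply left alone here, in line with the note that additional keys are ignored by the ATOM output but kept in json.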
Bridge guidelines
- metatags: @name {Name of service}, @homepage {URL to homepage}, @description, @update {YYYY-MM-DD}, @maintainer {Github username or nickname}
- scripts (eg. Javascript) must be stripped out. Make good use of strip_tags() and preg_replace()
- bridge must present data within 8 seconds (adjust iterators accordingly)
- cache timeout must be fine-tuned so that each refresh can provide 1 or 2 new elements on busy periods
- <audio> and <video> must not autoplay. Seriously.
- do everything you can to extract valid timestamps. Translate formats, use API, exploit sitemap, whatever. Free the data!
- don't create duplicates. If the website runs on WordPress, use the generic WordPress bridge if possible.
- maintain efficient and well-commented code 😉
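A rough sketch of three of these guidelines (strip scripts, no autoplay, valid timestamps), using only standard PHP functions. `sanitizeContent()` and `toTimestamp()` are hypothetical helpers, the list of allowed tags is an arbitrary choice, and regex-based cleanup like this is a starting point rather than a complete HTML sanitizer.

```php
<?php
// Hypothetical helper, not part of rss-bridge.
// Cleans an HTML fragment before it is used as an item's content.
function sanitizeContent($html)
{
    // Remove <script> elements together with their bodies. strip_tags()
    // alone would drop the tags but leave the JavaScript source as text.
    $html = preg_replace('#<script\b[^>]*>.*?</script>#is', '', $html);

    // Drop the autoplay attribute from <audio>/<video>, with or without a value.
    $html = preg_replace(
        '#(<(?:audio|video)\b[^>]*?)\s+autoplay(?=[\s=/>])(?:\s*=\s*(?:"[^"]*"|\'[^\']*\'|[^\s>]+))?#i',
        '$1',
        $html
    );

    // Keep a small allow-list of tags (an arbitrary choice for this sketch).
    return strip_tags($html, '<p><a><img><br><audio><video><source>');
}

// Hypothetical helper: converts a scraped date string into a Unix timestamp,
// or null when strtotime() cannot parse it.
function toTimestamp($text)
{
    $timestamp = strtotime(trim($text));
    return $timestamp === false ? null : $timestamp;
}

date_default_timezone_set('UTC');

echo sanitizeContent('<p>Hi<script>alert(1)</script></p><video src="a.mp4" autoplay controls></video>'), "\n";
// <p>Hi</p><video src="a.mp4" controls></video>

var_dump(toTimestamp('2014-05-01 12:00:00 UTC')); // int(1398945600)
```

When a site uses a date format strtotime() does not understand, translate it first (for example with preg_replace()) rather than dropping the timestamp.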
Rant
Dear so-called "social" websites.
Your catchword is "share", but you don't want us to share. You want to keep us within your walled gardens. That's why you've been removing RSS links from webpages, hiding them deep on your website, or removing RSS entirely, replacing it with a crippled or demented proprietary API. FUCK YOU.
You're not social when you hamper sharing by removing RSS. You're happy to have customers creating content for your ecosystem, but you don't want this content out - a content you do not even own. Google Takeout is just a gimmick. We want our data to flow, we want RSS.
We want to share with friends, using open protocols: RSS, XMPP, whatever. Because no one wants to have your service with your applications using your API force-feeding them. Friends must be free to choose whatever software and service they want.
We are rebuilding bridges you have wilfully destroyed.
Get your shit together: Put RSS back in.