extractKinja - Another backup solution [UPDATE : Initial batch import support]

[UPDATE on the 13th] It is now possible to batch import the articles listed on a single page, such as the main page of a blog or the “Posts” page of an author.

I made a tool that extracts articles from Kinja blogs and keeps only the content of the article (no header, footer, comments or “You might also like”), while saving or replacing any external content that would otherwise be fetched from Kinja.


No JavaScript from Kinja has been reused. I wrote only minimal code to display the tags menu and to open images in a new tab when you click on them. The YouTube, Twitter, Vimeo, Dailymotion, Imgur and Instagram widgets/embeds from Kinja have been replaced by “standard” ones.

It saves the article(s) on the web server along with a copy of the images and videos (including the author avatar, the blog favicon and the thumbnail used on the main page). For images, only the highest resolution is kept.
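To illustrate the "only the highest resolution is kept" rule, here is a minimal sketch in Python. It is not the tool's actual code: the `ImageVariant` type, the `pick_highest_resolution` function and the example URLs are all hypothetical, and it assumes each image comes with a list of variants whose pixel dimensions are known.

```python
from typing import NamedTuple, Sequence


class ImageVariant(NamedTuple):
    """One available rendition of an image (hypothetical structure)."""
    url: str
    width: int
    height: int


def pick_highest_resolution(variants: Sequence[ImageVariant]) -> ImageVariant:
    """Return the variant with the largest pixel area."""
    if not variants:
        raise ValueError("no image variants to choose from")
    # Compare by total pixel count so the largest rendition wins.
    return max(variants, key=lambda v: v.width * v.height)


# Hypothetical variants of one image; only the largest would be saved.
variants = [
    ImageVariant("https://example.com/img/small.jpg", 320, 180),
    ImageVariant("https://example.com/img/large.jpg", 1600, 900),
    ImageVariant("https://example.com/img/medium.jpg", 800, 450),
]
best = pick_highest_resolution(variants)
```

Keeping a single rendition saves space on the web server, at the cost of the stretched-avatar kind of issue when the largest file is not the one the page layout expects.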


It is still in beta (please let me know if you find any issues or have ideas). The next step is to improve batch import and, if possible, recreate the equivalent of the main page: a list of posts with the photo, title and author name.

How to use - To back up one article

  • Copy/paste either the ID of the article (e.g. 1845644279 for this page) or the full URL of the article you want archived (including https://) at the end of this URL: http://jbboin.phpnet.org/oppo/extractor/extractKinja.php?article= (for example: http://jbboin.phpnet.org/oppo/extractor/extractKinja.php?article=https://oppositelock.kinja.com/a-general-handbook-for-posting-on-oppositelock-1293992803)
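If you would rather build that link from a script than by hand, the step above can be sketched in Python as follows. The endpoint is the one given above; the `build_extract_url` helper is a hypothetical name of my own, and it simply appends the ID or URL as-is, exactly like the example link does (whether the server also accepts a percent-encoded URL is not something I have checked).

```python
# Endpoint taken from the instructions above.
EXTRACTOR_ENDPOINT = (
    "http://jbboin.phpnet.org/oppo/extractor/extractKinja.php?article="
)


def build_extract_url(article: str) -> str:
    """Build the extractor link for an article ID or a full article URL.

    Hypothetical helper: it only concatenates, mirroring the manual
    copy/paste step, and does not contact the server.
    """
    article = article.strip()
    is_id = article.isdigit()
    is_url = article.startswith(("http://", "https://"))
    if not (is_id or is_url):
        raise ValueError("expected a numeric article ID or a full URL")
    return EXTRACTOR_ENDPOINT + article


# Using the article ID of this page:
by_id = build_extract_url("1845644279")

# Using a full article URL:
by_url = build_extract_url(
    "https://oppositelock.kinja.com/"
    "a-general-handbook-for-posting-on-oppositelock-1293992803"
)
```

Opening either resulting link in a browser is then the same as pasting the ID or URL manually.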


How to use - To back up articles listed on a page

What has already been extracted is browsable here (it’s simply a DirectoryIndex at the moment)


Known bugs at the moment

  1. Image galleries are not working, but the images/videos are saved anyway (you can access all the files of an article by removing “article.html” from the URL)
  2. Comments are not integrated into the post. It’s not a bug, it’s a feature (for the time being at least), but they are saved in articleMetadatas.json
  3. The poster avatar can be stretched in some cases: at the moment the tool saves the highest resolution available for this image, which might not be the one normally used by Kinja FIXED
  4. Tweets currently have a fixed height, which crops big ones (with video, for example); Instagram posts have the same issue
  5. I haven’t worked on embedded Instagram posts (as I haven’t found one), so they will still use the Kinja widget for the time being FIXED
  6. Vimeo embedding is not working (here for example) FIXED
  7. Links in the article to other articles are not modified, so they will stop working once the Kinjapocalypse has happened
  8. Instagram posts (sometimes?) look a bit... not normal

 

Source code here, if someone is interested.

PS: I initially put the wrong tool name in the post title (KinjaExtractor instead of extractKinja). Sorry for the confusion :(