IMDB REST Webservice update

Update 2009-12-16

Looking for help in hosting

If you have a server that hosts PHP and you want to support the Scraper service please contact me (info at thumnet dot com).


Update 2009-12-12

The service is currently down due to too many requests to IMDb. Working on a fix!

Update 2009-12-09

  • Added Picture in imdb-name request
  • Added result count limitation param (see the guide)
  • Some small bugfixes:
    • missing tt and nm prefix in ImdbID property in imdb-name-search request,
    • missing ImdbID for Writers and Directors in imdb-title request,
    • changed Season and Episode in imdb-episode request to SeasonNR and EpisodeNR
    • added result type to the summary element, to identify the result data
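For client code that stored IDs from before the prefix bugfix, a minimal sketch of normalizing them could look like the following. The helper name `normalize_imdb_id` and the padding to seven digits are assumptions made for illustration; they are not part of the service.

```python
import re

# Hypothetical helper: the service itself now returns prefixed IDs.
PREFIXES = {"title": "tt", "name": "nm"}


def normalize_imdb_id(raw: str, kind: str) -> str:
    """Return an IMDb ID carrying its 'tt' (title) or 'nm' (name) prefix."""
    if kind not in PREFIXES:
        raise ValueError(f"unknown kind: {kind!r}")
    prefix = PREFIXES[kind]
    match = re.fullmatch(r"(tt|nm)?(\d+)", raw.strip())
    if match is None:
        raise ValueError(f"not an IMDb ID: {raw!r}")
    if match.group(1) not in (None, prefix):
        raise ValueError(f"{raw!r} is not a {kind} ID")
    # Assumption: IDs are zero-padded to at least seven digits.
    return prefix + match.group(2).zfill(7)
```

An ID that already has the right prefix passes through unchanged, so the helper is safe to apply to both old and new responses.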

Update 2009-12-02

  • Added Name search functionality, with imdb-name-search url param (imdb-search is now imdb-title-search)
  • Added Name details, with imdb-name url param

Update 2009-12-01

  • Fixed the Plot and Tagline for IMDb titles, see comments below.

It’s been some time since my post about the IMDB webservice but I’m proud to tell you readers there is a new version available.

Some important changes include:

  • Restructured output (sorry to you guys who have to update their software)
  • Output available in XML, JSON and debug (other output formats can be added on request)
  • Automatic support for gzipping the output
  • Summary information, containing:
    • data source
    • timestamp of the data
    • time taken in ms
    • scraper info
    • error code (0 for no error!) and error description
  • Easily extendable scrape framework, so in the future more sites can be scraped!
  • Admin interface to review the data you guys produce
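Since every response carries a summary with an error code, a client can check it before touching the rest of the data. The sketch below assumes the JSON output; the key names `Summary`, `ErrorCode` and `ErrorDescription` are hypothetical placeholders, so adjust them to what the service actually returns.

```python
import json


class ScraperError(Exception):
    """Raised when the summary reports a non-zero error code."""


def check_summary(payload: dict) -> dict:
    """Return the payload unchanged if its summary reports error code 0."""
    summary = payload["Summary"]        # hypothetical key name
    code = int(summary["ErrorCode"])    # hypothetical key name; 0 means no error
    if code != 0:
        description = summary.get("ErrorDescription", "")
        raise ScraperError(f"error {code}: {description}")
    return payload


# Hand-written sample, not real service output.
sample = json.loads('{"Summary": {"ErrorCode": 0, "ErrorDescription": ""}}')
check_summary(sample)
```

Raising an exception keeps the error handling in one place, instead of every caller comparing the code to 0 itself.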

Still interested or just curious?

Well the new url is: http://scraper.thumnet.com

The old version (imdb.thumnet.com) will only be available until 1 December 2009.
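The request URLs quoted in the comments follow the pattern base URL / output format / request type / query. A small sketch of building them is shown here. Only the `xml` path segment appears in those URLs; `json` and `debug` as path segments are assumptions based on the list of output formats, and `build_url` is a made-up helper name.

```python
from urllib.parse import quote_plus

BASE_URL = "http://scraper.thumnet.com"
# Assumption: "json" and "debug" are addressed the same way as "xml".
FORMATS = {"xml", "json", "debug"}
# Request types named in the updates above.
REQUESTS = {
    "imdb-title",
    "imdb-title-search",
    "imdb-name",
    "imdb-name-search",
    "imdb-episode",
}


def build_url(fmt: str, request: str, query: str) -> str:
    """Build a scraper URL such as .../xml/imdb-title/tt1176724."""
    if fmt not in FORMATS:
        raise ValueError(f"unknown output format: {fmt!r}")
    if request not in REQUESTS:
        raise ValueError(f"unknown request type: {request!r}")
    # quote_plus turns spaces into '+', matching .../imdb-name-search/brad+pitt
    return f"{BASE_URL}/{fmt}/{request}/{quote_plus(query)}"


print(build_url("xml", "imdb-name-search", "brad pitt"))
```

The function only builds the string; fetching it is left to whatever HTTP client you already use.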

Categories: Projects
  1. Jimmy
    November 4th, 2009 at 15:50 | #1

    Hi ,

    The ID below is not working the same as in the old scraper.

    http://scraper.thumnet.com/xml/imdb-title/tt1176724

    Br
    Jimmy

  2. November 5th, 2009 at 09:37 | #2

    @Jimmy
    I’m sorry for that, I didn’t test the XML output.

    The error occurs because some of the characters aren’t encoded and others are. I built in a check to test for characters that need encoding and encode them if IMDB didn’t.

    The problem should be fixed now.
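The service runs on PHP, so the following Python is only an illustration of the idea, not the actual fix: escape characters that still need it while leaving already-encoded entities alone. The function name `encode_if_needed` is made up for this sketch.

```python
import re

# Matches '&' that does NOT start an entity such as &amp;, &#229; or &#xE5;
_BARE_AMP = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)


def encode_if_needed(text: str) -> str:
    """Escape bare '&', '<' and '>' without double-encoding entities."""
    text = _BARE_AMP.sub("&amp;", text)
    return text.replace("<", "&lt;").replace(">", "&gt;")


print(encode_if_needed("Tom & Jerry &amp; friends"))
```

Running the function twice gives the same result as running it once, which is the property needed when part of the input is already encoded. It is meant for text content only, not for strings that contain markup.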

  3. Jimmy
    November 5th, 2009 at 13:00 | #3

    Hi, seems to be working, but I still have some issues with the encoding of special chars like the Swedish “Å Ä Ö”.

    If you compare the old one with the new you see what I mean.
    http://imdb.thumnet.com/xml/title/tt1521870
    and
    http://scraper.thumnet.com/xml/imdb-title/tt1521870

  4. November 5th, 2009 at 13:11 | #4

    @Jimmy
    Damn, stupid encoding.

    Try it now…

  5. Jimmy
    November 5th, 2009 at 17:16 | #5

    Hi , wow that was a fast response ;)

    but ÅÖÄ is still funky
    http://scraper.thumnet.com/xml/imdb-title/tt1307466

    “Vi hade i alla fall tur med vädret – Igen”

    but it should read
    “Vi hade i alla fall tur med vädret – Igen”

    /Jimmy

  6. November 7th, 2009 at 02:06 | #6

    Hi

    Is this API free to use ? Or should we have to get a license from IMDB ?

  7. November 9th, 2009 at 08:30 | #7

    @Jimmy
    Thnx, :D

    Maybe for these characters you can do the decoding yourself.

  8. November 9th, 2009 at 08:31 | #8

    @Imthiaz
    Hello, officially you would need a license from IMDB.

  9. Tyler
    November 16th, 2009 at 00:50 | #9

    Great service, thanks for your efforts.
    However, recently several data elements, such as Plot and Tagline, are missing from the returned XML. Previously these data elements were present.

  10. November 16th, 2009 at 09:17 | #10

    @Tyler
    Hi Tyler,

    Thnx for your comment.

    Elements no longer showing in the output means that IMDB changed their HTML. I’ll look into the problem and make a post here when it’s updated.

    A note to other users, please tell me when parts of the XML aren’t there or contain false data, so I can update the service.

  11. Bara
    November 29th, 2009 at 22:04 | #11

    It would be great if we can see the original (US) release dates of movies in the XML… Any chance of this being included?

    Also, would it be possible for you to open-source your scraper?

    Bara

  12. November 30th, 2009 at 14:15 | #12

    Great service indeed! I hope you get the missing data elements fixed soon – I didn’t even know the Plot and Tagline etc. were supposed to be there until I saw the comments.

    Also – any chance for searching by person name? Like for an actor, director etc. (http://www.imdb.com/find?s=all&q=brad+pitt) ? And then displaying the details for the person?

  13. December 1st, 2009 at 12:58 | #13

    I’m working on a little project that uses your IMDb scraper. Is there some way of contacting you other than these comments? I would like to mention the scraper as a source etc. and possibly encourage users to support your site.

  14. December 1st, 2009 at 14:34 | #14

    @Tyler

    @Markus

    Plot and tagline are fixed now.

  15. December 1st, 2009 at 14:38 | #15

    @Markus

    Before the end of this week I will release functionality to search people (names) and get information about them using their nm ID.

  16. December 1st, 2009 at 15:11 | #16

    @ThumNet
    Sweet!!

  17. JImmy
    December 1st, 2009 at 16:27 | #17

    Hi again :)

    Solved my decoding problems and everything is working great now.

    On the functionality side I would love to see support for release dates and alternative titles, as seen here for example:

    http://www.imdb.com/title/tt1216487/releaseinfo#akas

  18. December 2nd, 2009 at 16:22 | #18

    Thrilled about the new name search feature! Any chance there might be a ‘Name details’ option, with a list of relevant movie titles/IDs, coming up?

    A couple of questions:
    #1 Any idea why searching for ‘heroes’ or ‘2012’ doesn’t really return what one would expect?

    Try http://scraper.thumnet.com/xml/imdb-title-search/heroes or http://scraper.thumnet.com/xml/imdb-title-search/2012
    and compare with http://www.imdb.com/find?s=all&q=heroes
    The scraper’s XML has popular hits with only a node for ‘Picture’

    #2 Similar with name search:
    http://scraper.thumnet.com/xml/imdb-name-search/brad+pitt
    It returns only ‘Popular results’ with only one node -> ‘Picture’

    Thanks for all the work!

  19. December 2nd, 2009 at 20:19 | #19

    @Markus

    Fixed the problems you posted.

    Also added name details :D

  20. December 2nd, 2009 at 20:25 | #20

    @ThumNet In the words of director Burns – eeexcellent!

  21. December 7th, 2009 at 08:04 | #21

    Sweet, man. Eager to implement this in the old app I worked on (Hippo).

  22. December 8th, 2009 at 14:13 | #22

    I think I found a couple of things that need fixing/adding.

    #1 Actors/producers etc. in the person details XML are missing the ‘nm’ and ‘tt’ letters from their IMDb IDs -> http://scraper.thumnet.com/xml/imdb-name/nm0000093

    #2 Writers/directors/producers in the title details xml are missing their IMDb id nodes. -> http://scraper.thumnet.com/xml/imdb-title/tt0133093

    #3 Any chance of getting a picture for the person details xml also?

    #4 The episodes feed now has structure ‘Episodes’ > ‘Episode’ > ‘Episode’ – could this be ‘Episodes’ > ‘Episode’ > ‘Episodenumber’ ? It would make the parsing simpler if the node names inside the parent node were unique. Simpler for me that is as I’m not a ‘proper’ coder :)

    #5 A possibility to define the max length of the hits to limit the size of the XML would be cool

    #6 Also helpful would be to have a node in the XML ‘root’ that simply defines the type of the feed – like resultperson ( ‘resulttitle’, ‘detailperson’, ‘detailtitle’ ).
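To illustrate point #4, here is a small sketch of why a repeated node name makes parsing awkward. The sample XML is hand-written to mirror the ‘Episodes’ > ‘Episode’ > ‘Episode’ structure described above; the `Season` node and all values are assumptions, not real service output.

```python
import xml.etree.ElementTree as ET

# Hand-written sample mirroring the nesting described in #4.
SAMPLE = """<Episodes>
  <Episode><Season>1</Season><Episode>1</Episode></Episode>
  <Episode><Season>1</Season><Episode>2</Episode></Episode>
</Episodes>"""

root = ET.fromstring(SAMPLE)

# A recursive search returns the outer entries AND the inner number nodes.
all_matches = list(root.iter("Episode"))

# Restricting the search to direct children keeps only the outer entries.
entries = root.findall("Episode")
numbers = [int(entry.findtext("Episode")) for entry in entries]

print(len(all_matches), len(entries), numbers)
```

With unique names for the inner node, a recursive search for the number could not pick up the outer entries by accident, and the direct-children workaround would not be needed.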

  23. December 8th, 2009 at 14:26 | #23

    @Markus
    When I have some time to spare I’ll look into them! Most likely tomorrow evening/night.

  24. December 8th, 2009 at 14:36 | #24

    Ever so excellent, thanks!

  25. December 9th, 2009 at 21:04 | #25

    @Markus
    See update above!

  26. Joris Kommeren
    December 17th, 2009 at 08:53 | #26

    Any update maybe on when the website’s fixed?

    Other than that, I’m very very happy with your service, and your episode imdb adds exactly what I felt was missing. Keep up the good work!

  27. December 17th, 2009 at 10:15 | #27

    @Joris Kommeren

    Currently I’m looking for support in running the hosting script.

    So if anyone could help, please contact me. (info at thumnet dot com)

  28. kkr
    March 4th, 2010 at 00:27 | #28

    Hi Thumnet,

    Is this legal to use in a freeware app that I create?
    The IMDb website says we can’t use their data, but when accessing it through your website, how does the legal thing work?

    thanks
    k

  29. March 4th, 2010 at 10:30 | #29

    How the legal thing works exactly I really don’t know, but maybe someone else can point this out…?

  30. Markus
    March 4th, 2010 at 14:28 | #30

    @kkr
    For sure you cannot ask money for what ever you are doing with the data. In my project I’ve made sure to link to IMDb.com whenever possible and branded it as an UnOfficial IMDb client so as not to claim to be the owner of the data. All in all it will just generate traffic towards IMDb.com and hopefully they will see it as a good thing…

  31. Jimmy
    March 21st, 2010 at 19:05 | #31

    Hi !

    Been using your excellent service since the start now to get metadata for my movie collection and most of the titles work fine, but now and then there is a title that gives me problems.

    For example
    http://scraper.thumnet.com/xml/imdb-title/tt0153922/
    and
    http://scraper.thumnet.com/xml/imdb-title/tt1186830/

    where the tag contains a lot more than just the title. Is this something that could be fixed?

    Br
    Jimmy

  32. March 22nd, 2010 at 09:07 | #32

    Hi Jimmy, the problem can indeed occur and it is something that will be fixed in the future. But I can’t tell you when the fix will be available, because my private life is hectic at the moment.
