guide data/tv schedule website idea



timmy
15-05-2004, 11:31 AM
hmm here's a crazy idea, how about we make a tv schedule website??

...rambling

I say this because I think we need to seriously rethink how we work with guide data. Even with sat extraction the guide data isn't that great, ie it's missing a lot of season info, movie info etc, the kind of information that could be extracted from IMDB and TVTome for example.

Now we can automate that into the sat grab but you've still got a couple of problems, eg reliability + legality. No offense to Jaidev because he's pretty committed (except this week of all weeks, doh!) but even still, having some basic guide data to fall back on would be awesome.

Considering that most of what I (you too?) record is season stuff or movies, it's probably not that difficult. I've been thinking of a website where people could (visually) see a guide data schedule (much like sky's site, although ideally a lot better) and make changes to it (password protected/moderated, etc), eg incorporating season information, genres, etc.

The biggest criticism here is that the user base is too small. However I disagree: (1) you can extrapolate a lot of information from past data, eg season programs, (2) you can gather (layer?) the information from existing sources, eg saturn, tvnz, sky, etc, and (3) you can offer the format in xmltv as well to encourage mythtv and freevo users to participate.
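
For point (3), an XMLTV export could look roughly like the sketch below. The `Listing` class, the channel id and the sample programme are made up for illustration, and the fixed +12:00 offset ignores daylight saving; only the `tv`/`programme`/`title`/`desc` elements and the `start`/`stop`/`channel` attributes come from the XMLTV format itself. A complete XMLTV file would normally also carry `channel` elements, which are left out here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
import xml.etree.ElementTree as ET

# Fixed NZ standard time offset, hard-coded for this sketch (no DST handling).
NZST = timezone(timedelta(hours=12))


@dataclass
class Listing:
    channel: str          # hypothetical channel id, e.g. "tv1"
    start: datetime
    stop: datetime
    title: str
    description: str = ""


def xmltv_time(dt: datetime) -> str:
    # XMLTV timestamps look like "20040515180000 +1200".
    return dt.strftime("%Y%m%d%H%M%S %z")


def to_xmltv(listings: list[Listing]) -> str:
    """Serialise listings as a bare-bones XMLTV document."""
    tv = ET.Element("tv")
    for item in listings:
        prog = ET.SubElement(tv, "programme", {
            "start": xmltv_time(item.start),
            "stop": xmltv_time(item.stop),
            "channel": item.channel,
        })
        ET.SubElement(prog, "title").text = item.title
        if item.description:
            ET.SubElement(prog, "desc").text = item.description
    return ET.tostring(tv, encoding="unicode")


# Placeholder programme, not real schedule data.
sample = Listing("tv1",
                 datetime(2004, 5, 15, 18, 0, tzinfo=NZST),
                 datetime(2004, 5, 15, 19, 0, tzinfo=NZST),
                 "Evening News")
xml_text = to_xmltv([sample])
print(xml_text)
```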

...dreaming

It'd be real nice to have layers of data on top of time blocks. eg tv1 6:00-7:00. so for example u could (optionally) have saturn's data as one layer, sky's data as another, user entered data as another, etc. the top layer (which would be the data to be exported) would be a combination of these layers based on various rules and conditions. ie do we have this data from this source, has it been verified, is this the last episode in the season, etc etc.
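
A rough sketch of the layer idea, assuming each source is just a dictionary keyed by time block and that the only rule is "higher layers win, but an empty field never hides real data underneath". The source names, field names and sample values are placeholders; real rules (verified or not, last episode of the season, etc) would slot in where the `if value` check is.

```python
# A time block is identified by (channel, start time); each layer maps a
# block to whatever fields that source knows about.
Block = tuple[str, str]
Layer = dict[Block, dict[str, str]]


def merge_layers(layers: list[Layer]) -> Layer:
    """Build the export layer from layers ordered lowest to highest priority."""
    top: Layer = {}
    for layer in layers:
        for block, fields in layer.items():
            merged = top.setdefault(block, {})
            for name, value in fields.items():
                if value:  # an empty value never overrides data from a lower layer
                    merged[name] = value
    return top


block = ("tv1", "2004-05-15 18:00")
saturn = {block: {"title": "Evening News", "genre": ""}}
sky = {block: {"title": "EVENING NEWS", "description": "National news."}}
users = {block: {"genre": "News", "episode": ""}}

export = merge_layers([saturn, sky, users])
print(export[block])
```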

The ideal situation would be to have it entirely user driven; that would be legal then (a point to argue with some of you here I am sure!!).

I know Bruce Simpson (aardvark.co.nz) is thinking something along these lines for an xml tv schedule (probably a lot simpler in concept tho). He has mentioned it briefly on his site, but knowing him it'll be months or years before he gets it complete.

just ideas, constructive criticism and ideas please ;)

hh75
15-05-2004, 12:19 PM
Tim,

I think this is a great idea and I'd be more than willing to help out - e.g. keying or uploading listing data (I'm just into Channels 1,2 & 3 - no sky) from Listener for example. Presumably a slice would be autogenerated from the site snapshot at regular and defined intervals ?

Great idea

Rgds Halfdan

brucer
15-05-2004, 12:46 PM
Timmy, I think that's a great idea, I write (among other things) asp.net web applications for a living so maybe I can help, like you though I don't have a lot of free time.. Another thought I had was the possibility of a "Tivo Club" with a small annual sub to fund some basic things like a permanent website and emulator (eg nztivo.org).. just a random thought ;)

Bruce

timmy
15-05-2004, 03:51 PM
> Presumably a slice would be autogenerated from the site snapshot at regular and defined intervals ?

The site would most likely be separate from the tivo emulator and slice maker, which would be autogenerating your guide slices. Makes no difference if it is networked and live does it? mmm webservice ;)

Separating it from Tivo probably isn't a bad idea either. Think about why we are using second hand Tivos in the first place dude.

> I write (among other things) asp.net web applications for a living so maybe I can help,

Mate, if only I knew what to code. Actually I haven't had a whole lot of success with ASP.NET to date. Nor much experience with dotNet either to tell the truth. Personally, for the kind of office app development I do, MS Access is much more suitable as the development time is much faster. (rant coming on...) I hate Access, for example the way it uses absolute positioning for forms, but it's still fast and you can store the tables on another database (ie SQL Server) if you need to centralise or scale up. Maybe you can code faster in .NET if you know what you're doing (I don't yet) but it seems like you have to do a lot of the database/UI grunt work yourself. Databinding is one thing I really appreciate with Access :)

anyway... I reckon building database driven webservices in .NET is simple enough, I just have reservations about using ASP.NET for building a full fledged (ideally visual) interface. I'm not sure how to handle time-series data and display it in a way that is going to be easy (dream... slick~~) and intuitive for users to work with.

...dreams

A full fledged 3d interface would be the ****. I had a look at SVG (w3.org) the other day but it's only (primarily) 2d based. You could probably do 2d quite nice with it tho but it doesn't seem that well suited for interactivity.

just the possibility of showing a lot more information at once by using 3d is really worth thinking about. I thought this SVG Project (http://heml.mta.ca/heml-cocoon/) was kind of interesting. It's 2d but it's time based so could be adapted for our purposes.

... somewhere closer to reality

probably Flash would be better suited but that isn't an area of my expertise either. Even a pretty standard/simple interface like sky's, a la tables or DOM/CSS for example, would suffice, showing one or two time periods per channel. Not a huge fan of sky's 2hr time period tho; it's great for finding something on right now, as you can read some of the programme info at that view on the same page (eg times/titles in sky's case). For us tho we would be better off showing like a week or two at a time. Unfortunately even for one channel that's quite a lot of data so you can't show (much if any) text at that level. Just very small color coded time blocks ;)

The color codes for example could represent various types of programme info or completeness of programme info. Mousing over a time block could provide a more detailed view (eg individual record view) and a click on a block could show the actual data for editing (again possibly in some way still in view with the two week data). Also some linear navigation and the ability for a full edit view, ideally showing past data for that time block so you could not only enter new data but pull thru old data by defining rules and conditions. The linear mode would be useful for those entering in one at a time from the Listener err memory ;)
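
As a rough sketch of color coding by completeness: count how many of the fields we care about are filled in and map that count to a color for the block. The list of wanted fields and the thresholds are assumptions picked for the example, not anything agreed on.

```python
# Fields we would like every listing to have (an assumed list).
WANTED = ("title", "description", "genre", "season", "episode")


def filled_fields(listing: dict) -> int:
    """Number of wanted fields that are present and non-empty."""
    return sum(1 for name in WANTED if listing.get(name))


def block_color(listing: dict) -> str:
    """Map completeness to a color name for a small time block."""
    filled = filled_fields(listing)
    if filled >= 4:
        return "green"   # nearly complete
    if filled >= 2:
        return "orange"  # usable but thin
    return "red"         # needs attention


partial = {"title": "Buffy", "genre": "Drama"}
print(filled_fields(partial), block_color(partial))
```

The same function could drive the mouse-over view, eg by listing which wanted fields are still empty.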

thoughts?

timmy
15-05-2004, 04:55 PM
more rambles..

also handling users' activities is a whole other issue (presumably you will need to moderate users and contributions to some extent). hehe, god it's starting to sound like a mini content management system almost :)

anyone heard of anyone else stupid enough to do this or are we just complete suckers in this country? Anyone seen any nice tv schedule interfaces online?

alright, time to do something non computer related. drink beer :D

cya's

Brandoo
15-05-2004, 08:51 PM
I'm all for the idea - the majority of the stuff I do is MySQL and PHP based.

As some of you may already know, I'm already getting listing data from a couple of sources. This data is being stored in a MySQL database.

If there are a few ppl keen then let me know and I can arrange access to the database.

Only problem I have at the moment is the ability to create a slice with the data :)

timmy
15-05-2004, 09:18 PM
likewise i have a database for my listings. The problem isn't so much a centralised public database as it is the ability to update and keep track of the data in the database. For that you need some sort of interface. Just having a database accessible doesn't mean much if you can't interact with it easily, ie via a web browser for example.

A simple layout like nzoom/tvnz for example might even suffice, ie show one day (eg monday) and one channel per page with vertical scrolling. You can put two or three weeks (ie three mondays) side by side to get a comparison with the previous or future data.
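
Putting the same weekday from several weeks side by side also makes the extrapolation idea easy to sketch: if a slot held the same title in most of the past weeks, suggest it for the coming week. The "more than half of the weeks" rule and the placeholder titles are assumptions for the example.

```python
from collections import Counter
from typing import Optional


def guess_slot(past_weeks: list[dict[str, str]], time: str) -> Optional[str]:
    """Guess a title for `time` from the same weekday in earlier weeks.

    past_weeks: one {start time: title} dict per earlier week.
    Returns None when there is no clear repeat to extrapolate from.
    """
    seen = Counter(week[time] for week in past_weeks if time in week)
    if not seen:
        return None
    title, count = seen.most_common(1)[0]
    # Only extrapolate when the slot held this title in most past weeks.
    return title if count * 2 > len(past_weeks) else None


# Three past Mondays for one channel (placeholder titles).
mondays = [
    {"18:00": "News", "19:00": "Current Affairs", "19:30": "Show A"},
    {"18:00": "News", "19:00": "Current Affairs", "19:30": "Movie Special"},
    {"18:00": "News", "19:00": "Current Affairs"},
]

for slot in ("18:00", "19:30", "21:00"):
    print(slot, guess_slot(mondays, slot))
```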

Dunno, the reason I thought it'd be good to show a couple of weeks at a time is so you wouldn't lose sight of the wood for the trees so to speak.

...now back to drinking :eek:

zollymonsta
15-05-2004, 09:25 PM
I can help with keying in data... :) no good with programming though :rolleyes:

samjohnson
16-05-2004, 03:14 PM
If you are looking at using .Net to create an application then maybe you should consider the WebZinc (www.webzinc.net) .net component - it makes it quite simple to extract stuff from webpages and it is quite easy to modify the code when the design or layout of the webpages change.

You can also create an XML based script for extracting the data which can be modified for a site change without having to modify the actual application that uses it.

deanm
19-05-2004, 10:37 AM
nice idea.. the simpler the better for a start I reckon..
along the lines of a wiki.. with user voted 'trust' levels
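
One way the 'trust' levels could work, as a minimal sketch: when users disagree about a field, add up the trust of everyone backing each value and keep the best-backed one. The usernames, the trust numbers and the default trust of 1 for unknown users are all made up for the example.

```python
from collections import defaultdict


def pick_value(submissions: list[tuple[str, str]], trust: dict[str, int]) -> str:
    """Pick the value backed by the most total trust.

    submissions: (username, proposed value) pairs for one field of one listing.
    trust: user-voted trust level per username; unknown users count as 1.
    """
    if not submissions:
        raise ValueError("no submissions to choose from")
    totals: dict[str, int] = defaultdict(int)
    for user, value in submissions:
        totals[value] += trust.get(user, 1)
    # On a tie, the value submitted first wins (dicts keep insertion order).
    return max(totals, key=lambda value: totals[value])


trust = {"alice": 5, "bob": 1}
genre_votes = [("alice", "Drama"), ("bob", "Comedy"), ("carol", "Comedy")]
print(pick_value(genre_votes, trust))
```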

the guide data from the 'other local' website seems quite credible..
Is there a reason we're not using it? :confused: It would seem to be a good basis to start the wiki on, as all that's missing really is TV1/2 data.

I grabbed 7 days worth of data and opened it in TV Guide For Windows (http://tvguide.sourceforge.net/)
seems to work a treat.. a few holes, but pretty darn good. :)

timmy
19-05-2004, 11:01 AM
yes simple me thinks ;)
i like to dream what can i say?

by 'other local' website i presume u mean sky?

the genres aren't very suitable, and there's limited description info. could supplement with other websites, etc. one benefit of having a tvschedule website is that if these sites break, which sky seems to quite a lot, we can at least have extrapolated data rather than nothing at all. having a single site to act as a sort of clearing house for guide data isn't a bad thing either. i know trying to support grabbers for tvnz, sky, saturn, etc, and now bbc is a bit of a pain, but if we can have multiple people experimenting with grabbers and stuff that would be great ;)

deanm
19-05-2004, 11:29 AM
was actually thinking of the nerdthings.com grabber :p
[apologies if we're not supposed to mention other sites :o]

I realise the sky data leaves a bit to be desired.. but as the base data for the suggested site.. perfect.

timmy
19-05-2004, 11:33 AM
i thought that grabber was only pulling saturn?

deanm
19-05-2004, 11:34 AM
there's an option for Sky or Saturn.

I haven't actually compared the two 'tho.

Tony
19-05-2004, 11:38 AM
Guys,

Fantastic idea!

I've been wondering about something on similar lines, though it struck me what a good idea it might be if schedule data could be anonymously submitted by anyone, thereby creating a beautiful legal firewall by creating strong "plausible deniability" about the source of the information... no one person would be creating it, and it wouldn't come from any one source.

The guys building the slices and running the emulator wouldn't need to have any idea at all where it came from, in fact. They could safely believe someone was typing it all out of "The Listener"... maybe someone will be!

How about creating a structure whereby multiple anonymous submissions can be made, and some form of AI is used to extract and merge the best data from various submissions?

It'd become possible then to download data from the site, tweak it, and upload modifications. This way, individual community members could build "modules" on their own systems to do things like IMDB lookups, built on whatever platform they're comfortable with, then share the improved data with everyone else.
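
A "module" of this kind might be as small as the sketch below: take downloaded listings, fill in any missing fields from a lookup table the member maintains, and hand back improved copies for upload. The lookup here is a plain dictionary standing in for whatever source a member uses; no real IMDB interface is called, and the field names and the 1978 entry are placeholder values.

```python
def enrich(listings: list[dict], lookup: dict[str, dict]) -> list[dict]:
    """Return copies of the listings with missing fields filled from `lookup`.

    lookup maps a programme title to extra fields (year, genre, ...).
    Existing data is never overwritten, only gaps are filled.
    """
    improved = []
    for listing in listings:
        extra = lookup.get(listing.get("title", ""), {})
        merged = dict(listing)  # copy, so the downloaded data stays untouched
        for name, value in extra.items():
            if not merged.get(name):
                merged[name] = value
        improved.append(merged)
    return improved


downloaded = [
    {"title": "Some Movie", "genre": ""},
    {"title": "Unknown Show", "genre": "Drama"},
]
movie_info = {"Some Movie": {"genre": "Thriller", "year": "1978"}}

for row in enrich(downloaded, movie_info):
    print(row)
```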

And of course, since there are multiple data sources there would be good redundancy when a particular source falls over for whatever reason.

Cheers,

Tony

Brandoo
19-05-2004, 05:47 PM
I've started writing something that I think does what Timmy is talking about.... will see how it turns out :)

number6
30-05-2004, 08:29 AM
> Guys,
>
> Fantastic idea!
>
> I've been wondering about something on similar lines, though it struck me what a good idea it might be if schedule data could be anonymously submitted by anyone, thereby creating a beautiful legal firewall by creating strong "plausible deniability" about the source of the information... no one person would be creating it, and it wouldn't come from any one source.
>
> The guys building the slices and running the emulator wouldn't need to have any idea at all where it came from, in fact. They could safely believe someone was typing it all out of "The Listener"... maybe someone will be!
>
> How about creating a structure whereby multiple anonymous submissions can be made, and some form of AI is used to extract and merge the best data from various submissions?
>
> It'd become possible then to download data from the site, tweak it, and upload modifications. This way, individual community members could build "modules" on their own systems to do things like IMDB lookups, built on whatever platform they're comfortable with, then share the improved data with everyone else.
>
> And of course, since there are multiple data sources there would be good redundancy when a particular source falls over for whatever reason.
>
> Cheers,
>
> Tony

I have no problems with what you are proposing, but I am sure that the copyright holders to the program data you "host" on that website would have problems with it.

Just because the guide data is published in Skywatch or in the TV Guide or Listener or in the local paper etc doesn't mean that information is public domain with no copyright attached to it.

If you look carefully in the TV Guide, Skywatch, the bottom of the TV listings in your local paper or on TVNZ's websites you will see that the networks take steps to proclaim their copyright [admittedly it's usually in small print somewhere, but it's there nonetheless, just like it is in a book].

Sky have previously stated that they consider their Program Guide data to be part of their "crown jewels" of Intellectual Property [IP]. TVNZ likewise has a similar position.

So, the short answer: if you set up such a site, expect a court order forcing you to shut it down sooner rather than later.

Sure, you could host the whole thing offshore - but first you'll have to find some country to host your website/server that doesn't honour copyright agreements, and China is about the one country in the Internet connected world that used to ignore copyright rules [well, when it was someone else's IP that is], but since they joined the WTO, you can bet that's gonna change.

timmy
30-05-2004, 08:41 AM
I have set up hosting in the US. I do not consider the fact that a programme is showing at a certain time and on a certain channel to be copyrighted information. Bruce Simpson (aardvark.co.nz) also shares this belief, which is good enough for me.

Although my current plans are to pull in data from existing (website) sources, which is definitely a problem, if the site had a reasonably large community I believe it could operate very successfully with pure user input. That is my belief, but depending on whether that happens, or if indeed I feel it's more useful to use sat extraction as a source, I'll look into different hosting solutions.

Information should be free after all... this is not piracy, I am not stealing anything. But yes, Sky could potentially bring about a lawsuit if they wanted to. We'll see in time I guess...

number6
30-05-2004, 09:49 AM
> I have set up hosting in the US. I do not consider the fact that a programme is showing at a certain time and on a certain channel to be copyrighted information. Bruce Simpson (aardvark.co.nz) also shares this belief, which is good enough for me.

That may be so, but I don't see any TV listings on aardvark (any more), or if they are there they are well hidden. I know he used to have TV listings some time ago but obviously the economics of the situation have got to him.

In any case in This story (http://www.aardvark.co.nz/daily/2002/0423.shtml) Bruce Simpson is complaining about his copyright being infringed by Xtra caching pages to his site.

Bruce can't have it both ways. Either everything on the web is copyright in any form, or it's not.

> Although my current plans are to pull in data from existing (website) sources, which is definitely a problem, if the site had a reasonably large community I believe it could operate very successfully with pure user input. That is my belief, but depending on whether that happens, or if indeed I feel it's more useful to use sat extraction as a source, I'll look into different hosting solutions.
>
> Information should be free after all... this is not piracy, I am not stealing anything. But yes, Sky could potentially bring about a lawsuit if they wanted to. We'll see in time I guess...

Technically saying that Buffy is on at 9PM on TV2 next Friday is not breaking copyright.

However providing a full listing including episode summary/plot for others to download and use probably will - especially if the listing is in fact a "100%" copy of the same entry from the TV networks' published schedules.

Yes, you could try and argue, what if each person keys in one episode for one program a week, then individually they are all merely making "non-infringing" copies (in legal terms) of a small portion of the TV networks listings in the same way as photocopying 1 page of a book is allowed in some circumstances [or sampling a few seconds of a longer piece of music is also allowed] by the copyright rules.

But in that case, the collective effect is to overcome the copyright that the TV networks hold over their listings by "one thousand" minor (non-infringing) copies.
This exact scenario is covered by many legal cases and precedents in copyright law and is not allowed.

If that wasn't the case we could all get together and collectively steal any copyright material we like by simply making as individuals many small copies of parts of the larger work then joining them up into a "new work" and making that available.

In any case, the "fair use" rights that we may have, generally do not permit the combining and storage of such listings in a form beyond that they were originally published in (e.g. webpage, printed magazine) without the permission of the copyright holder.

While what you and your Tivos (or me and my Tivos) get up to with the Sky webpage data is between you and your Tivos.

Doing the same for all and sundry to use with copyright TV listings is really a much different league (legally).

Don't confuse the fact that you "give this information away for free" either as a let off - legally this is no defence and is no different than what p2p services like Napster did, except with music.

While the many TV episode guide sites that exist in the US can argue for protection under the 1st Amendment [right to free speech], that doesn't necessarily extend to websites hosted in the US by non-US citizens.

Maybe you need to get Ed (Hintz) to host your website in the US, then you'd have your bases covered more than you do now.

timmy
30-05-2004, 10:12 AM
hmm, well if tvnz and sky actually had decent guide data and if I was directly copying that then yes I can see their point. However, their guide data isn't much more than the basic *factual* information. eg programme name, episode, time, duration.

I hardly see how using that information (not harvested from their sites) and building it up with information from tvtome, imdb, etc and user input is really infringing on their IP?

------------------

"Technically saying that Buffy is on at 9PM on TV2 next Friday is not breaking copyright."

That is the key information. If that isn't copyright (which realistically is just silly if it is) then you can use that and build up your own guide data from other (free) sources.

"However providing a full listing including episode summary/plot for others to download and use - probably will - especially if the listing is in fact a "100%" copy of the same entry from the TV networks published schedules."

In our case it ain't.

----------------

"Maybe you need to get Ed (Hintz) to host your website in the US"

umm they are in the US (both my homepage (still missing my domain, sigh) and the new one), the hosting is cheaper than NZ :)

---------------
ok another way to look at it..

channel/day -> programmetime -> programme -> episode

The first two pieces are factual (eg channel/day -> programmetime) and the last two are copyright by whoever writes/publishes that information. Therefore if people publish information on my site (assuming it is written by themselves) then the guide data will be owned (copyrighted/lefted whatever) by the community. :p
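
The chain above maps fairly directly onto a small data model. This is only a sketch of one possible shape; the class and field names are invented for the example, and the date is just "the Friday after this post" to match the Buffy example quoted earlier.

```python
from dataclasses import dataclass, field
from datetime import date, time
from typing import Optional


@dataclass
class Episode:
    """Community-written detail about one episode."""
    title: str = ""
    season: Optional[int] = None
    number: Optional[int] = None
    description: str = ""


@dataclass
class Programme:
    name: str
    genre: str = ""
    episode: Optional[Episode] = None


@dataclass
class ProgrammeTime:
    """The factual part: something starts on this channel at this time."""
    start: time
    programme: Optional[Programme] = None


@dataclass
class ChannelDay:
    channel: str
    day: date
    slots: list[ProgrammeTime] = field(default_factory=list)


# "Buffy is on at 9PM on TV2 next Friday", with the rest left to fill in.
friday = ChannelDay("tv2", date(2004, 6, 4))
friday.slots.append(ProgrammeTime(time(21, 0), Programme("Buffy")))
print(friday)
```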

gadgetman
01-06-2004, 05:17 AM
I like the way this site lays out data:

http://home.nzcity.co.nz/tvnow/tvguide.asp?c=w

The idea of sticking to half hour time slots on the left, then placing a time next to the program name if the time differs (such as 9:05am).
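
That layout rule is simple to sketch: drop every programme into the half-hour row it starts in, and only print its own start time when it is off the half-hour. The input format (sorted "HH:MM" and title pairs) and the placeholder titles are assumptions for the example.

```python
def grid_rows(programmes: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Lay programmes out on a half-hour grid.

    programmes: ("HH:MM", title) pairs, sorted by start time.
    Returns (row label, cell text) pairs, one per half-hour row in use.
    """
    rows: dict[str, list[str]] = {}
    for start, title in programmes:
        hour, minute = (int(part) for part in start.split(":"))
        row = f"{hour:02d}:{'00' if minute < 30 else '30'}"
        # Show the real start time only when it differs from the row's time.
        label = title if minute in (0, 30) else f"{start} {title}"
        rows.setdefault(row, []).append(label)
    return [(row, ", ".join(cells)) for row, cells in rows.items()]


morning = [("09:00", "Show A"), ("09:05", "Show B"), ("09:30", "Show C")]
for row, cell in grid_rows(morning):
    print(row, cell)
```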

Just my 2c worth....

timmy
01-06-2004, 05:35 AM
Defining set widths (ie time intervals) limits you on how much information you can display. I initially liked the idea as it gives you a sense of programme duration but in hindsight i'm not so keen. Actually it's kind of irrelevant: i'm developing this tvschedule site as a set of coldfusion components (persistent objects), essentially accessed as webservices. the frontend could take shape and form in any number of ways, as a pure webservice, webpages, or flash.

hmm work on the tvschedules website will slow now that sat ext is nearly complete and that I have to help Jaidev by continuing with the data parser for it (essentially done tho).

gadgetman
01-06-2004, 08:51 AM
ok then, if the front end is not well defined, how about moving away from a web based idea and back to a rich client app, such as vb/c++/c# etc, there is MUCH more functionality using COM components (or whatever else) and the slow page refreshes associated with a website are removed. Of course the app would still need to pull its data from some internet location...

What do you think?

timmy
01-06-2004, 09:00 AM
ahh well my frontend is well defined but considering all of the functionality will be available as web services there is nothing stopping you from implementing a desktop front end for it.

actually this whole project is in question because it would raise the profile of satellite extraction, since i'd like to use that as a source. although i'm thinking i don't want to use the descriptions because that really is taking it too far, and i think the community and other web sources can supply better descriptions anyway

brucer
01-06-2004, 11:06 AM
it's looking to me that we're going to end up with everyone having to run their own emulator and grabbers (via proxy to avoid IP logging). if sky is actually serious about the copyright issue then the whole idea of a shared "public" slice is doomed, period, regardless of where the data originates.

Perhaps the better (and legal) community effort at this point would be to make it easy for newbies to set up an emulator, grabber and slice generator on their own PC.

Just thinking out loud.. ;)

ehintz
01-06-2004, 11:29 AM
> it's looking to me that we're going to end up with everyone having to run their own emulator and grabbers (via proxy to avoid IP logging). if sky is actually serious about the copyright issue then the whole idea of a shared "public" slice is doomed, period, regardless of where the data originates.
>
> Perhaps the better (and legal) community effort at this point would be to make it easy for newbies to set up an emulator, grabber and slice generator on their own PC.
>
> Just thinking out loud.. ;)

IANL. But I can comment that the emulator is hosted in the US, which might complicate things for Sky.

The "private" emulator is fairly popular with the Canucks, but their data sources don't seem to be as hostile. The constant rewrites for Sky makes that a dodgy method.

If copyright stuff becomes an issue I suspect P2P will be the solution. Esp. if we can seed the slice offshore.

brucer
01-06-2004, 11:37 AM
> The constant rewrites for Sky makes that a dodgy method

Not really, if we have an "open source" grabber there are more than enough programmers here to keep on top of it without overburdening anyone..

ehintz
01-06-2004, 12:01 PM
> Not really, if we have an "open source" grabber there are more than enough programmers here to keep on top of it without overburdening anyone..

I guess it comes down to the laws; I don't know what they are here. Back in the States they would probably put some crap encoding in and then make it a DMCA violation (circumvention device), in which case most programmers would not want to play whack-a-mole (as the mole) with Sky. I kinda think that if Sky's gonna get ugly, they'll get ugly on an OS project and the members just as quick as if it's just 1-2 folks getting data. Maybe if it was hosted on sourceforge and sufficiently anonymized or something it would work. But again I'm still pretty ignorant of NZ laws so I'd be happy to defer to someone with an actual clue in that department... ;)

brucer
01-06-2004, 12:27 PM
I'd assume that assembling your own personal slice for a service for which you are a paying subscriber would fall under "fair use"? (of course I'm not a lawyer either).. the only way I can think that they could permanently defeat grabbers would be to present all program information as images rather than text, but they'd have to be _really_ upset about it to go that far surely. I also think it would be hardly worth their while to constantly change their website just to punish a very small group of people who are, after all, customers anyway.. it just wouldn't make financial sense to spend all that time on constant web development and would eventually lead to much grumbling from the vast majority of "legitimate" users..
while we're on the subject.. has there actually been any indication from Sky that they even care? Even when they eventually launch their own pvr service they're not likely to lose much market share to diy tivo enthusiasts.. most people, myself included, would prefer to have a reliable pay service than muck around in the linuxy bowels of a hacked tivo ;)

number6
01-06-2004, 01:09 PM
> I guess it comes down to the laws; I don't know what they are here. Back in the States they would probably put some crap encoding in and then make it a DMCA violation (circumvention device), in which case most programmers would not want to play whack-a-mole (as the mole) with Sky. I kinda think that if Sky's gonna get ugly, they'll get ugly on an OS project and the members just as quick as if it's just 1-2 folks getting data. Maybe if it was hosted on sourceforge and sufficiently anonymized or something it would work. But again I'm still pretty ignorant of NZ laws so I'd be happy to defer to someone with an actual clue in that department... ;)

We don't have a DMCA (yet), but the existing copyright laws are hopelessly out of date (although a digital age update to the Copyright Act is coming this year we're told).

Because of the lack of recently updated laws, the existing precedents are used by the courts to determine copyright issues and I think you'll find that the rights in this area are pretty much all in Sky's camp.

For instance, Sky could apply for an "ex-parte" (without prior notice) "interim injunction" against Tim, you, me and everyone else who hangs out at his website or who has a Tivo, prohibiting us from touching their program listing info in any way.

Then we could argue it in court, but that would take 3-7 years as it's a "civil matter" and/or it could cost us megadollars to try and get the injunction overturned. You could well be living back in the US before that got decided in court!

Secondly, they could apply [in extreme circumstances admittedly, but it has happened] for an "Anton Piller order", which is a Search & Seizure order from the court, and this is carried out by the court in conjunction with their lawyers, not the Police [which is even more worrying actually].

Such an order could permit them (among other things) to seize every Tivo, computer and anything else possibly related to them you have in your house, your ISP or whatever - whether or not it had any Sky Program Guide data on it. Having all this hosted outside NZ is a good first step to protect against this sort of thing, but it only goes so far, as at the end of the day you and your Tivos will be in NZ and that's the bit that can be subjected to NZ Copyright Law.

Basically, the guts of this are - let sleeping dogs lie - i.e. don't piss them off to such an extent that they (over) react against you.

brucer
01-06-2004, 01:31 PM
> Basically, the guts of this are - let sleeping dogs lie - i.e. don't piss them off to such an extent that they (over) react against you.

Wow.. thanks for the info (I think :rolleyes: )

I guess having every Tivo owner thrashing their website with grabbers would fall into the "pissing them off" basket.. hmm, from now on I just use my Tivo to pause live tv :p

number6
01-06-2004, 01:54 PM
> while we're on the subject..has there actually been any indication from Sky that they even care? Even when they eventually launch their own pvr service they're not likely to lose much market share to diy tivo enthusiasts..most people, myself included, would prefer to have a reliable pay service than muck around in the linuxy bowels of a hacked tivo ;)

Yes, Sky have said that they regard EPG information (i.e. their Program Guide data) as proprietary to Sky and part of their (Intellectual) Property.

For this reason they refuse to release it to anyone without payment and retain the ownership of the program data in all cases anyway, so you can't do a deal with Sky then sell it to all and sundry without their say so.
They even turned down Microsoft's offer to get access to Sky's EPG info for their competing product [Xbox TV or whatever it was called].

One guy a few years back did an EPG licensing deal with Sky (before Sky's website had listings on it & before Microsoft came-a-calling) to create a "TV scheduling" service which allowed you to bookmark programs (via his website) you wanted to record and the site would email you reminders of stuff to record. He had EPG info from Sky, TVNZ, TV3 and TAB etc but eventually folded the business because he couldn't make money from it.

In his application the website retained the EPG info at all times and the EPG info wasn't onsold to anyone or used in devices like a Tivo.

There were few Tivos in NZ at that stage and in any case, his concept was to try and make money from reselling the Guide data if he could. Basically he was just ahead of the times. Shame really, we could have used him about now... - might have cost each of us about $10 per month subscription though :-)

The other point is that yes Sky may well not be too bothered about a bunch of DIYers repackaging their EPG info for "home use" for now.

However, eventually [and sooner rather than later if the EPG info is good and consistent] this arrangement will become abused as the overseas (and local) eBay/Trademe operators start trying to flog off cheaply sourced US Tivos for hundreds of dollars, saying "all support and Guide data for your Tivo is free from XXXX" - which is what is happening in AUS right now, and to be honest there is little they can do about it except lock down the emulator to only dish out guide data to "known tivos" to block the hangers on.

Then suddenly Sky find that their (precious) guide data is being used to prop up a commercial enterprise [some guy flogging off US Tivos on ebay to NZers], and they will react accordingly - and it will be by going after everyone, not just the abusers.

And you'll also have TV3 and TVNZ breathing heavily in Sky's direction, since their (precious) EPG data is now also being stolen/used without their permission (as a result of Sky's data being flogged). Sky get a lot of benefits, financial and otherwise, from having TVNZ and TV3 on their Pay-TV platform: not least, it keeps the government from regulating the entire pay-tv industry here and also keeps TVNZ and TV3 from pushing ahead with Digital Terrestrial TV, a competing platform to Sky's Satellite. And Sky have just signed up for 15 more years of Satellite on a (yet to be launched) Optus D1 sat, so they are committed to Sat Pay-TV for now.

So you can see that Sky has a lot to lose and not a lot to gain from letting this sort of thing happen, even in a de-facto kind of way.

In any case, if you disagree with all this, try ringing up Sky and see if you can get their permission to use their EPG data on your Tivo.
We already tried this approach without much luck some time ago and I doubt it will be any better now.

timmy
01-06-2004, 02:46 PM
Ok the legalities of these things are all very interesting and stuff but it still leaves us in a conundrum. I personally think trying to grab from a website is a waste of time. The data isn't very feature full anyway and the problems of trying to maintain a working grabber creates a reliance on programming folk to put in their time to maintain it. I have thought about distributed and proxied grabbing but flagged it... i'm now pretty firm on community built tv listings, which I believe is technically legal.

So my question is, do I continue with my tvschedule site and if so do I use existing online or satellite listings?

If I proceed with the site it is going to raise the profile of our activities, no doubt about it. If I use existing sources (ie web/satellite) of data then it is highly questionable legally so it's a double whammy. If I don't and the guide data is still coming from web/satellite then is the community going to put their backing behind it? If they don't we're left with a tvschedule site that isn't going to be useful and possibly an increased profile for other guide data grabbing activities.

thoughts people...

brucer
01-06-2004, 03:49 PM
how does all this work in Australia? Obviously they've got a much higher profile.. have they been harassed? Is the law different over there or do they have some kind of licensing agreement?

timmy
01-06-2004, 04:10 PM
hmm another 2c worth

yeah they are grabbing from websites, their grabbers break, they just keep fixing them, awhile back one site decided they didn't want oztivo grabbing from them so put a stop to it (or something happened anyway), i think they had to source an alternative site, basically they have agreed to not discuss where the guide data comes from and leave it at that. Personally i'm surprised that they manage to get such good guide data from online sources, obviously they're better at coding than me ;)

a while back someone wanted to start a business off the back of oztivo, the community was pretty anti as u can imagine. There is another tivo enthusiast currently trying to create his own PVR service (icetv.com.au?), don't think anyone even knows how he's going about the guide data but he is offering it as a pay service same as tivo so dunno how he is skirting the law...

number6
01-06-2004, 04:15 PM
how does all this work in Australia? Obviously they've got a much higher profile.. have they been harassed? Is the law different over there or do they have some kind of licensing agreement?

They are just like us, living in the fear of the coming of the great (free) Guide Data Killing/Publicity monster.

Since there are many more TV channels spread all over Aus and each state seems to have their own laws that can change or remove Federal (National) laws, the legal position is pretty much muddier than here, especially since by doing stuff across state lines you can circumvent a lot of legal issues.

However, with the recent rise in profile due to newspaper articles and the rise of the "trying to make a buck off the OzTivo group" operators on eBay, they are finding that things may start hotting up there soon.

They do have an advantage in that there are some (Pay to use [ebroadcast]) EPG services in Aus, and in theory if enough Tivo owning folks reached into their wallets each month they could buy a legal EPG feed but the entry fee for that is a little steep (about 10K per month I think). Also, there seems to be a PVR service launching soon in Aus which may also have a positive spin-off benefit to the Tivo guys.

None of this is much use to us though - but they are in the same boat as us more or less, just a little closer to the front of the boat than we are - so that when the onrushing waves hit (us &) them, they may get tipped out first, but we'll no doubt follow suit.

I guess as soon as a proper PVR service hits these shores, the floodgates may finally open up with full EPG data becoming more available, but right now, no-one wants to be the first mover on this so we end up without legal EPG data.

We may have to wait until one of the Aus EPG service operators finally decides to take on NZ as well.

Also, don't forget that it's one thing to have EPG data, but it must be up to date and accurate always to be any use.
I guess you've all seen what it's like to have semi-ok Guide data - it's basically unusable and is more trouble than it's worth!

The Aus guys have all found that the existing TV EPG sites they use have lots of problems with bad data - and so a lot of their effort is spent trying to fix dodgy data, let alone trying to obtain reliable clean sources for those hard to get channels. Then you have to align all the data so that the Genres from Source A are consistent with all the other sources they use.
(Curling anyone?)
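
The genre alignment step can be sketched roughly as a lookup table. This is a minimal Python sketch under stated assumptions, not what the OzTivo grabbers actually do: the source names, the raw genre strings and the shared genre list are all made up for illustration.

```python
# Hypothetical mapping from (source, raw genre) to one shared genre list.
# None of these source names or genre strings come from a real feed.
GENRE_MAP = {
    ("source_a", "sport"): "Sports",
    ("source_a", "winter sport"): "Sports",
    ("source_a", "movie"): "Movies",
    ("source_b", "sports - curling"): "Sports",
    ("source_b", "feature film"): "Movies",
}


def normalise_genre(source, raw_genre, default="Unclassified"):
    """Map a source-specific genre label onto the shared genre list."""
    # Lower-case and collapse whitespace so minor formatting
    # differences between grabs do not cause lookup misses.
    key = (source, " ".join(raw_genre.lower().split()))
    return GENRE_MAP.get(key, default)
```

Anything that falls through to the default is a candidate for a human to look at and add to the table, which is presumably where much of the effort goes.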

Really, considering it costs only $20 per month per-Tivo for the US Tivos to have local, accurate guide data, they get a really good deal considering all the hard work that Tivo do keeping the data up to date and accurate.

gadgetman
01-06-2004, 05:13 PM
This is aimed more at the others who have developed grabbers. Reading comments here and on other sites about how the grabbers have been written and the fact they break often, I feel I have developed a solution that works differently. I know the whole grabber issue is supposed to be superseded soon, but in the meantime I thought I should share what I did.

I have a VB6 app with two IE controls. The first loads the main schedule page from Sky for a given date and time, the DOM of this object is then examined, and any links to program data are then loaded as a page in the second IE control.

This removes all problems with parsing HTML, and I get all the info about each program, like Genre, date, time, description etc. The only problem is it is slower than parsing pure HTML: one grab of a 3 hour timeslot for all channels takes about 5 minutes, depending on how fast Sky's site is. Currently I save this in an Access database (free to any who want a look) but that was just because I couldn't be bothered at the time creating a SQL database.
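
For anyone without VB6, the same idea (walk the parsed document and collect the links to the per-programme pages, rather than pattern-matching raw HTML) can be sketched in Python with the standard library parser. This is only an illustration: the `programme` marker in the link and the `programme.cfm?id=` URL shape are assumptions, not the real structure of Sky's pages.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ProgrammeLinkCollector(HTMLParser):
    """Collect hrefs of anchors that look like programme detail links."""

    def __init__(self, base_url, marker):
        super().__init__()
        self.base_url = base_url
        self.marker = marker  # hypothetical substring identifying detail links
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and self.marker in href:
            url = urljoin(self.base_url, href)
            if url not in self.links:  # keep first occurrence, preserve order
                self.links.append(url)


def programme_links(schedule_html, base_url, marker="programme"):
    """Return absolute URLs of programme detail pages found in a schedule page."""
    collector = ProgrammeLinkCollector(base_url, marker)
    collector.feed(schedule_html)
    return collector.links
```

Each URL returned would then be fetched in turn, which is the slow part described above.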

This may be a stupid question but would an approach like this give enough raw data for the website idea being promoted on another thread? I have already started expanding my app to pull data for movies from IMDB, more for personal reference than anything.

As for Sky restricting/cutting access to their site, my 2c is that ANY change Sky makes costs them money. I read an interesting article about how their systems all tie together, something like 7 different types, and it sounds VERY complex (too much so) so I would think they would only stop scraping as a last resort. Of course legal action is another issue, but they have to prove who got the data! How many people here know about the free anonymous Internet access you can get using GPRS on Vodafone prepay?

Whew! This whole comment has become too long.

timmy
01-06-2004, 05:49 PM
hmm, using the DOM is an interesting approach. nice..
Seriously tho, they can make small changes to those pages very easily, they are after all just coldfusion pages. I know because they have made a number of minor but obvious changes to the html tags and it caused havoc for me until i rewrote my regular expressions to make them more flexible..
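
As a rough illustration of what "more flexible" can mean for a scraping regex, here is a minimal Python sketch that tolerates changes in tag case, attributes and whitespace. It is not the actual grabber code, and the table-cell markup is an assumed example rather than Sky's real HTML.

```python
import re

# Tolerant pattern: any case, any attributes, any whitespace around the content.
CELL = re.compile(r"<td\b[^>]*>\s*(.*?)\s*</td\s*>", re.IGNORECASE | re.DOTALL)
# Used to strip nested formatting tags such as <b> or <font>.
TAGS = re.compile(r"<[^>]+>")


def cell_texts(html):
    """Return the visible text of every table cell, ignoring markup details."""
    return [" ".join(TAGS.sub(" ", raw).split()) for raw in CELL.findall(html)]
```

The same function returns "Example Show" whether the cell is written as `<td class="title">Example Show</td>` or as an upper-case `<TD>` with extra attributes, line breaks and a nested `<b>`. It still breaks if the data moves out of table cells altogether, which is the limit of any regex approach.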

The other big issue was reliability with the site, network timeouts, etc..
hmm, encoding to images would be their ultimate f'off to us. Wouldn't take much effort on their part and they only need do it randomly on some of the programme info.

hmm, don't fool yourself into thinking it is hard for them to change their site. I don't believe that is the case unless they are even more stupid than I think. I know they have a variety of systems to pull the data thru but the frontend is probably completely separate and hence able to be modified as much as they want. Presumably the data is just getting pushed into a database that is accessible to the (external) webserver, that's pretty standard... and from a security standpoint it wouldn't be ten foot from the real database(s). In saying that i'm not sure tho, because the site was designed by DNZ rather than SKY. They provide a CMS (Content management system) for the site, tho whether the listings are a part of that is anybody's guess. who knows who cares...

anyway, i'm thinking to continue development on the tvschedules site and not use any existing guide data sources. That is my feeling right now. I'll gear it up more for general tv watchers (ie non pvr people) as well as support for tivo and xmltv folk. we'll see if it catches any interest... it is good web programming practice for me anyway, not that i dont do enough already. ugh

brucer
01-06-2004, 05:53 PM
FWIW, leaving aside all the grabber issues for a second it seems to me we have a couple of realistic choices:

1. we ignore the legal risk and go ahead with a community tvschedule website and emulator and hope nobody notices.

or

2. we all set up our own emulators (or use LOADGUIDE) and roll our own slices and/or share slices via some anonymous P2P service. I'm sure we could safely provide public "for entertainment purposes only" information about setting up an emulator/using LOADGUIDE/using wktivoguide and maybe locating slices on P2P. Beyond that we limit public discussion to technical Tivo issues.

Option 1 is easiest for the non-technical (and therefore open to abuse by commercial interests) but also apparently exposes people like Timmy and Ed to possible legal action.

Option 2 is less convenient but workable, lower profile and much harder to set the lawyers onto.

gadgetman
01-06-2004, 06:02 PM
Might be a stupid question, but has anyone asked TVNZ/Sky/etc if their schedule data is available, even for a fee? It must be viable (at some level) for the likes of TV Guide, Listener etc.

timmy
01-06-2004, 06:05 PM
hmm, another problem with the sky site is that some channels are updated less frequently than others.
For example tv3,c4,newsaus,prime, etc. These channels only seem to have a week's worth of data yet the
other channels have up to 3 weeks. FYI the site seems to get updated around the first or second week of the month.

So basically you have to regrab more frequently, yet the structure of the pages makes this very download intensive, as each programme's info is on a separate page and you can't easily just scrape for a select channel. Even if you can pull out the programme IDs for that channel (ie matching up the table tags and pulling out all of the programme IDs for that channel) you still have the problem of hitting those pages, eg in two hr intervals that's 12 page requests per day. 7x12=84 pages just to figure out what programme IDs you want... annoying. easily doable of course...
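
The arithmetic above, as a small Python sketch. The grid page count follows directly from the figures in the post; the number of programmes per day is a made-up input for illustration, since each programme needs its own detail page request on top of the grid pages.

```python
def grid_page_requests(days, interval_hours):
    """Schedule grid pages needed to cover `days` at one page per interval."""
    pages_per_day = -(-24 // interval_hours)  # ceiling division
    return days * pages_per_day


def total_requests(days, interval_hours, programmes_per_day):
    """Grid pages plus one detail page per programme.

    programmes_per_day is a hypothetical figure for the chosen channel.
    """
    return grid_page_requests(days, interval_hours) + days * programmes_per_day


print(grid_page_requests(7, 2))  # 84, matching 7x12 above
```

With an assumed 30 programmes a day on the channel, `total_requests(7, 2, 30)` comes to 294 requests for the week.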

I know those programme ID's are in sequence but buggered if i wanted to figure out how the sequences were being allocated.

oh and yes we (well not me personally) have asked sky and they have been less than helpful. and those that do pay for their listings cannot share them of course... that would be too easy my friend ;)

gadgetman
01-06-2004, 06:20 PM
If only sky would set up an FTP server they wouldn't have all our scrapers hitting them, their site would be faster, we would get all the data we need, and we would all live happily ever after!!! :D

timmy
01-06-2004, 07:00 PM
well it was very nearly like that actually. they had an xml file available for download for a while, unfortunately it wasn't meant to be public so they password protected it...

www.sky.co.nz/data

welcome to run a password list over it if u want :rolleyes: who knows might get lucky

deanm
03-06-2004, 11:32 AM
does anyone subscribe to Sky watch.. would that be considered as paying for Sky EPG? It would seem a reasonable argument.. :confused:

Now say I take that magazine I paid for.. and employ a data entry clerk to key the data into a database for my personal use.. nothing fancy.. just time.. date.. program title.. (which would be pretty damn cheap and quick IMHO and in terms of copyright would constitute 'Fair Use' of data i've paid for) and my personal database adds to the 'base' data from an online source like IMDB or TVTome.. or even a *cough* NZ based data source ;) am i breaking the law yet? :)

OK.. i might be wrong here (it's been known to happen :p ), but say individual TiVO users.. all 'armed' with Skywatch subscriptions form a 'hobbyist embedded linux device club'.. and pool resources.. yeah.. yeah.. it's a tad simplistic, but you get where i'm going with it. right?

This doesn't address free-to-air data.. or TiVO users without SKY.
It would also mean access to the emulator and 'club' data would only be via paid membership.. even though the club is a non-profit organisation.. there are the basic running costs and the data entry clerk to consider.

While old school.. and VERY low tech.. it's also low profile.
No IP to track from grabbing.. just a bunch of geek hobbyists doing their thing.. and feeding a data entry clerk. ehehe

thomson
03-06-2004, 11:52 AM
and in terms of copyright would constitute 'Fair Use' of data i've paid for)
The problem here is that we do not have the American 'Fair Use' system... we use the United Kingdom 'Fair Dealing' which is a little different and does not really include reasonable use... Essentially, in this country, if it is copyright then you can not copy it. The (reasonable use) loopholes that appear in the 'Fair Use' system are not openly available to us.

deanm
03-06-2004, 01:39 PM
:o bugger!

OK.. while it is a flawed plan.. it would seem to be a lot less 'trauma' than our current situation.

gadgetman
03-06-2004, 02:05 PM
Whilst I now think the idea of a website to enter data is a good way forward (it gives information such as seasons that are unavailable elsewhere), if people want to pursue the scraping idea, there is an easy way to do it totally anonymously using GPRS on Vodafone. Plus scraping COULD be used as a basis for the website programming.

Because Vodafone becomes an ISP when using GPRS and prepay involves no disclosure of personal information, you can connect to the 'net without ANY identifying information being available, and then you can scrape away. The IP address is dynamic as well, so unless Sky block the whole address range Vodafone use, it is an idea going forward.

number6
03-06-2004, 04:01 PM
Whilst I now think the idea of a website to enter data is a good way forward (it gives information such as seasons that are unavailable elsewhere), if people want to pursue the scraping idea, there is an easy way to do it totally anonymously using GPRS on Vodafone. Plus scraping COULD be used as a basis for the website programming.

Because Vodafone becomes an ISP when using GPRS and prepay involves no disclosure of personal information, you can connect to the 'net without ANY identifying information being available, and then you can scrape away. The IP address is dynamic as well, so unless Sky block the whole address range Vodafone use, it is an idea going forward.

I don't think anyone here should be that worried about being tracked by Sky/TVNZ etc when downloading the EPG info from their website, after all that's why it's up there in the first place - to be accessed.

The real problem is that Sky and others can change their website anytime they like for all sorts of reasons [not just to do with blocking downloaders] and whenever this happens it's goodnight nurse to the current web-site scraper.

So then someone has to spend hours working out why it's broken, then fixing it, until that website or another one changes.

Yes, they can also block accessors by IP address, but in the days of ADSL and Dynamic IP addresses it's no more of an issue for Sky to block all Xtra IP addresses than it is to block all of Vodafone's if they choose.

No-one said that owning a Tivo without a paid for reliable service was going to be easy - this is "ragged edge" stuff here.

As far as using Vodafone's GPRS service to download websites, the cost of that would soon mount up, I'm sure, beyond paying someone to sit down and key the TV listings from your Skywatch into your Tivo - and all without the hassle of needing to have web-site scrapers in the first place.
You'd have to repeat it for TV1,2, 3 and Prime as well, and if you live in Wellington, do the same for Saturn TV.

And at the end of the day it's no more legit to do that than the web-site scraping approach.

The long term solution is to have some decent competition out there in the Pay TV market.

The best longer term solution is to get Sky or whatever launching a Tivo-like service of their own, as this will need (reliable & constantly fed) EPG data to work and then you can make use of that stream somehow to feed your Tivo's guide data habit.

Right now the whole guide data thing is quite limited by the TV networks' refusal to make Guide data available for more than about 7 days ahead.

The Tivo works best with at least 10 days or more of Guide data to work with.

Sky's EPG (on their decoder) seldom goes more than 7 days ahead anyway (it used to go a whole 10-14 days when it started).
TVNZ is not much better either.

brucer
03-06-2004, 04:51 PM
Vodafone GPRS is slow and obscenely, revoltingly, disgustingly expensive at $10 per MB for casual use (I use it sparingly sometimes with my Treo PDA), there are hundreds of free anonymous proxy servers out there if you're worried about IP logging (although scraping is dramatically slower via most proxies)..
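
To put the $10 per MB figure in perspective, here is a back-of-envelope sketch in Python. The 84 pages is the weekly grid page count quoted earlier in the thread; the 50 KB average page size is purely an assumption, and programme detail pages would come on top.

```python
def gprs_cost(pages, kb_per_page, dollars_per_mb=10):
    """Rough transfer cost in dollars, counting 1 MB as 1000 KB."""
    return pages * kb_per_page * dollars_per_mb / 1000


print(gprs_cost(84, 50))  # 42.0
```

So even the grid pages alone would run to roughly $42 a week under those assumptions, before a single programme description is fetched.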


The Tivo works best with at least 10 days or more of Guide data to work with.

Why is that?


The best longer term solution is to get Sky or whatever launching a Tivo-like service of their own

Of course it will still be illegal but the difference will be they would actually have a good reason to crack down to protect their own PVR service.

It's all very depressing :(

brucer
03-06-2004, 04:54 PM
btw, with all the "criminal intent" expressed in this thread it might pay to delete it once we're done :D

gadgetman
03-06-2004, 06:59 PM
Thought I would be boring and ask Timmy how the site is going? Is there any help of a technical nature you need, or perhaps some automated testing?

And in regards to your question a few days ago about if you should use web data to seed your site, I think the answer should be yes. If the responsibility is placed onto people, there will always be a time when it doesn't get done. Whilst we can get the EPG data automatically I think we should.

Cheers!

timmy
03-06-2004, 07:26 PM
hey dude,

been kinda flatout lately so haven't done much at all this week tho i'll have some time to work on it this weekend. msn or icq me as i have a few questions for you actually.

have coded i'd say 30-40% of core functionality, excluding security and logging (these are just general (class) cross cutting concerns anyway). hmm, doesn't sound like much but a good start as the design (database schema & object model) took a fair amount of time to devise.