Originally Posted by Wanted
Bring on the predicted data. I missed Firefly last night because of a gap in TV3 listing.....
Actually, the predicted data would not solve the problem: it is usually implemented as a duplicate of the previous week's slice, so if there was a gap in that slice, there would be the same gap in the predicted data. You would also only be using the predicted data if you had not been able to contact the emulator to receive the real data (which may itself have gaps in it).
The program that creates the slice needs to be improved so that there are no gaps in the data. There is a chance that the parsers/scrapers are the ones at fault, but the other sources of XML-TV data seem to contain the missing information, which leads me to believe it is the slice generator that is getting confused.
For example, TV1 always has gaps, but the following script will parse the TVNZ site fairly well (for TV2 just replace tvone with tv2 in the URL). It does not work out the duration of the episodes, but that can be deduced from the start time of the following show. [The optional parameter to this function is the day (e.g. thursday).]
Code:
TV1() {
    TMP=/tmp/TV_TVNZ_TV1.tmp
    if test -n "$1"; then
        # Optional argument is a day name (e.g. thursday); needs GNU date.
        day="&date=$(date -d "$1" +%d%m%Y)"
    else
        # TVNZ site is a little broken: it seems to display yesterday's
        # listing if today's is not available, so always pass the date.
        day="&date=$(date +%d%m%Y)"
    fi
    lynx -dump -nolist -width=1024 "http://tvnz.co.nz/view-preempt/tvone_epg_skin/channelDesc=tv_one$day" >"$TMP"
    # Decode leftover HTML entities (assumed here to be &#39; and &amp;).
    sed "s,&#0*39;,',g;s,&amp;,\&,ig" "$TMP" |
    while read -r line; do
        # A programme line ends in its start time (HH:MM).
        if printf '%s\n' "$line" | grep -q ' [0-9][0-9]:[0-9][0-9]$'; then
            title=$(printf '%s\n' "$line" | sed 's, [0-9][0-9]:[0-9][0-9]$,,')
            time=$(printf '%s\n' "$line" | sed 's,.* \([0-9][0-9]:[0-9][0-9]\)$,\1,')
            # The description is on the next line; strip its trailing digits.
            read -r desc
            desc=$(printf '%s\n' "$desc" | sed 's,[0-9]*$,,')
            echo "$title|$time|TV1|$desc"
        fi
    done
}
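The duration step mentioned above could be sketched roughly like this. It assumes the `title|time|TV1|desc` lines printed by the function above, in broadcast order; `add_durations` is a hypothetical name, not part of the original script. Each show's duration is taken as the gap to the next show's start time, and the last show in the listing has no following start time, so its duration field is left empty.

```shell
# Hypothetical helper: reads "title|HH:MM|channel|description" lines on
# stdin and appends the duration in minutes as a fifth field.
add_durations() {
    awk -F'|' -v OFS='|' '
        # Convert "HH:MM" to minutes since midnight.
        function minutes(t,    p) { split(t, p, ":"); return p[1] * 60 + p[2] }
        NR > 1 {
            d = minutes($2) - minutes(prev_time)
            if (d < 0) d += 1440    # the listing crossed midnight
            print prev, d
        }
        { prev = $0; prev_time = $2 }
        # Last show: no following start time, so leave the duration empty.
        END { if (NR > 0) print prev, "" }
    '
}
```

It would be used as a filter on the output, e.g. `TV1 thursday | add_durations`.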