Peter's Blog 7.7.2004

2004-07-07

PyDSS Gentoo problem revisited

Georg Bauer commented on my hack P376 yesterday to say:

"your hack has disabled conditional GET completely, so you now pull down feeds every hour fully. This might get you some angry comments if somebody finds out ;-)"
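
For context, conditional GET means sending the validators saved from the last fetch so the server can answer 304 Not Modified instead of resending the whole feed. A minimal sketch of the idea in modern Python 3 (this is not PyDS code; the function name and the example values are made up for illustration):

```python
import urllib.request

def build_conditional_request(url, etag=None, last_modified=None):
    """Build a GET request carrying cached validators (hypothetical helper)."""
    req = urllib.request.Request(url)
    if etag:
        # Server replies 304 if its current ETag still matches this one.
        req.add_header("If-None-Match", etag)
    if last_modified:
        # Server replies 304 if the resource has not changed since this date.
        req.add_header("If-Modified-Since", last_modified)
    return req

# Example values are invented; nothing is sent over the network here.
req = build_conditional_request(
    "http://example.org/feed.xml",
    etag='"abc123"',
    last_modified="Wed, 07 Jul 2004 10:00:00 GMT",
)
```

Without those two headers the server has no choice but to send the full feed every time, which is what Georg is pointing out.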

I decided to investigate further. The problem boils down to this code in DownstreamTool.open_http:

def open_http(self, url, data=None):
    numheaders = len(self.addheaders)
    self.isHTTP = 1
    self.lastURL = self.getTheUrl(url)
    try:
        theurl = self.getTheUrl(url)
        self.message = _('opening url: <a href="%s">%s</a>') % (theurl, theurl)
        if not(self.force):
            for h in self.cache._getUrlHeaders(theurl):
                apply(self.addheader, h)
                self.message += _('<br>adding Header "%s: %s"') % h
        urlpieces = urlparse.urlparse(url[1])
        url = (urlpieces[1], url[1])
        res = urllib.URLopener.open_http(self, url, data)
        self.message = self.message.replace('%', '%%')
        <snip>

It turns out that this code is DIFFERENT from the corresponding code on my old PyDS install, although BOTH are supposed to be 0.7.2 (both say so at the top of the screen). The two lines:

    urlpieces = urlparse.urlparse(url[1])
    url = (urlpieces[1], url[1])

have been added in the Gentoo version. The problem seems to be that the variable url may be either a simple string or a tuple, and the new code assumes it is a tuple when it is actually a string. I don't know precisely what the new code is supposed to do, but taking it out fixes the problem.
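
To illustrate the string-versus-tuple mix-up, here is a small sketch in modern Python 3 (the quoted code is Python 2, where urlparse was a module of its own). The function names and example URLs are made up for illustration and are not part of PyDS; the guarded version is only one possible way to handle both cases, not a claim about what the Gentoo change intended.

```python
from urllib.parse import urlparse  # Python 3 home of the old urlparse module

def split_host(url):
    """Mimic the two added lines: assumes url is a (host, full_url) tuple."""
    urlpieces = urlparse(url[1])
    return (urlpieces[1], url[1])

def split_host_safely(url):
    """Hypothetical guard: only index into url when it really is a tuple."""
    full = url[1] if isinstance(url, tuple) else url
    return (urlparse(full)[1], full)

# Invented example inputs, one of each shape.
as_tuple = ("example.org", "http://example.org/feed.xml")
as_string = "http://example.org/feed.xml"

print(split_host(as_tuple))          # host and full URL, as intended
print(split_host(as_string))         # url[1] is just the second character
print(split_host_safely(as_string))  # string handled as a whole URL
```

In this sketch the string case does not raise an error: indexing a string with [1] simply returns its second character, so the result is an empty host and a one-character "URL".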

Georg mentioned that I am repeatedly downloading articles. I did notice this effect, but I was already seeing duplicate entries before (particularly from the BBC), so I was accustomed to skipping past them.

Still, why is the PyDS version 0.7.2 in Gentoo different from my old version of 0.7.2 (which came from Georg's Debian package)?

posted at 17:17:04

A blog documenting Peter's dabblings with Python, Gentoo Linux and any other cool toys he comes across.

© 2004, Peter Wilkinson
