Showing posts from January, 2018

S3 boto3 'StreamingBody' object has no attribute 'tell'

I was recently trying to use the Python package warcio by feeding an S3 object from the Common Crawl bucket directly into it:

    import boto3
    from warcio.archiveiterator import ArchiveIterator

    s3 = boto3.client('s3')
    r = s3.get_object(Key='crawl-data/file....', Bucket='commoncrawl')
    for record in ArchiveIterator(r['Body']):
        pass

However, this fails with the error:

    self.offset = self.fh.tell()
    AttributeError: 'StreamingBody' object has no attribute 'tell'

The reason is that the StreamingBody object that boto3 returns as the body of an S3 object doesn't support tell. It's easily fixable by creating a tiny class that wraps the body and counts the bytes read so far:

    class S3ObjectWithTell:
        def __init__(self, s3object):
            self.s3object = s3object
            self.offset = 0  # number of bytes read so far

        def read(self, amount=None):
            result = self.s3object.read(amount)
            self.offset += len(result)
            return result

        def close(self):
            self.s3object.close()

        def tell(self):
            return self.offset

You can now use this class and change for record in ArchiveItera
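To check the wrapper's byte counting without touching S3, here is a minimal sketch that runs it against a local stand-in. FakeStreamingBody is a hypothetical class invented for this illustration (it is not part of boto3); it only mimics an object that has read and close but no tell. The sample bytes are made up as well.

```python
import io


class S3ObjectWithTell:
    """Wrapper from the post: counts the bytes read so far."""

    def __init__(self, s3object):
        self.s3object = s3object
        self.offset = 0

    def read(self, amount=None):
        result = self.s3object.read(amount)
        self.offset += len(result)
        return result

    def close(self):
        self.s3object.close()

    def tell(self):
        return self.offset


class FakeStreamingBody:
    """Hypothetical stand-in for the S3 body: read/close only, no tell."""

    def __init__(self, data):
        self._buffer = io.BytesIO(data)

    def read(self, amount=None):
        return self._buffer.read(amount)

    def close(self):
        self._buffer.close()


# Made-up sample bytes, not a real WARC record.
data = b"WARC/1.0\r\nexample payload"
body = FakeStreamingBody(data)
wrapped = S3ObjectWithTell(body)

first = wrapped.read(8)              # read a fixed number of bytes
offset_after_first = wrapped.tell()  # position after the partial read
rest = wrapped.read()                # read everything that is left
offset_after_rest = wrapped.tell()   # position at the end of the stream
wrapped.close()
```

Because the offset only ever grows by the length of what read returns, tell reports the position in the stream as long as every read goes through the wrapper.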