S3 boto3 'StreamingBody' object has no attribute 'tell'

I was recently working with the Python package warcio and tried feeding an S3 object from the Common Crawl bucket directly into it:

r = s3.get_object(Key='crawl-data/file....', Bucket='commoncrawl')
for record in ArchiveIterator(r['Body']):
    pass  # process each record here

However, this fails with the error:
self.offset = self.fh.tell()
AttributeError: 'StreamingBody' object has no attribute 'tell'

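The reason is that the StreamingBody object that boto3 returns as r['Body'] doesn't implement tell(), which warcio calls to keep track of its position in the stream. It's easily fixable by wrapping the body in a tiny class that counts the bytes read: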

class S3ObjectWithTell:
    def __init__(self, s3object):
        self.s3object = s3object
        self.offset = 0

    def read(self, amount=None):
        result = self.s3object.read(amount)
        self.offset += len(result)
        return result

    def close(self):
        self.s3object.close()

    def tell(self):
        return self.offset

You can now use this class by changing

for record in ArchiveIterator(r['Body']):

to

for record in ArchiveIterator(S3ObjectWithTell(r['Body'])):
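
If you want to convince yourself that the wrapper counts bytes correctly without touching S3, something like the sketch below should do. ReadOnlyStream is a made-up stand-in that, like the StreamingBody in the error above, has read() and close() but no tell(); it is not part of boto3. The wrapper class is repeated only so the snippet runs on its own.

```python
import io


class ReadOnlyStream:
    """Hypothetical stand-in for the S3 body: read() and close(), but no tell()."""

    def __init__(self, data):
        self._buffer = io.BytesIO(data)

    def read(self, amount=None):
        return self._buffer.read(amount)

    def close(self):
        self._buffer.close()


class S3ObjectWithTell:
    """Same wrapper as in the post: counts bytes read so tell() can report them."""

    def __init__(self, s3object):
        self.s3object = s3object
        self.offset = 0

    def read(self, amount=None):
        result = self.s3object.read(amount)
        self.offset += len(result)
        return result

    def close(self):
        self.s3object.close()

    def tell(self):
        return self.offset


data = b"WARC/1.0\r\nsome payload bytes"
raw = ReadOnlyStream(data)
wrapped = S3ObjectWithTell(raw)

print(hasattr(raw, "tell"))        # the bare stream has no tell()
first = wrapped.read(8)            # read the first 8 bytes
print(first, wrapped.tell())       # offset now equals the bytes read so far
rest = wrapped.read()              # read everything that is left
print(wrapped.tell() == len(data)) # offset ends up at the total length
wrapped.close()
```

Note that the offset here is the number of bytes read through the wrapper, so it only matches the position in the object if every read goes through the wrapper from the start.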
