Download images linked by URL?

Hello dear community,

I have integrated images from external sources into my database via URL (e.g., Google Photos). Is there a way for the linked images to be permanently embedded in the database? If I’m not mistaken, only the images on the external pages are called up, but not stored in the database itself. However, this would be very helpful for me because I manage an extensive collection where the images should be preserved even if the originals were deleted from the external sources.

Thank you very much for your help, orannge

Hi @orannge, and welcome to the SeaTable forum!

There is no built-in feature that does this directly, but scripting might help! If you’re not familiar with scripting, please consult the SeaTable help on the subject. Here is an example script. To use it, create a new script, copy and paste the code below into it, and run it from the script editor. Before using it, don’t forget to back up your data! I wrote this script against an example base of mine, so it may not behave as expected on yours; a backup will let you easily recover from any data loss.
Now, there are a few variables you’ll have to change so that the script runs in your base:

  • TABLE_NAME: the name of the table containing your images
  • IMAGES_COLUMN_NAME: the name of the column containing your images
  • BACKUP_IMAGES_COLUMN_NAME: the name of a backup image column you have to create. Each image defined by an external URL that the script was not able to retrieve/update will be moved to this column. For these, unfortunately, you’ll probably have to download them manually and reimport them from your device. Let’s hope there aren’t too many.
  • ROW_IDENTIFIER_COLUMN_NAME: the name of the column that will be used to help you identify the rows with image processing errors
from PIL import Image
from io import BytesIO
import os
import requests
from seatable_api import Base, context
from urllib.parse import urlparse
from pathlib import Path

### Replace the names between single quotes with those of your actual table/columns
TABLE_NAME = 'User'
IMAGES_COLUMN_NAME = 'Images'
BACKUP_IMAGES_COLUMN_NAME = 'Bkp'
ROW_IDENTIFIER_COLUMN_NAME = 'Name'


base = Base(context.api_token, context.server_url)
base.auth()
rows = base.list_rows(TABLE_NAME)
rows_data = []
bkps_data = []
imgs_cpt = 0
imgs_success = 0
for row in rows:
  imgs = row.get(IMAGES_COLUMN_NAME)  # .get() avoids a KeyError if the cell is missing from the row
  errors = []
  if imgs:
    uploaded_images = list(imgs)  # work on a copy so removing items does not disturb the loop over imgs
    backup_images = []
    for img in imgs:
      if isinstance(img, str) and not img.startswith(context.server_url+'/workspace/'+str(base.workspace_id)+'/asset/'+base.dtable_uuid+'/images') and not img.startswith('data:') and not img.startswith('custom-asset://'):
        imgs_cpt += 1
        uploaded_images.remove(img)
        filename = os.path.basename(urlparse(img).path)
        filename_without_extension = Path(filename).stem
        file_extension = Path(filename).suffix
        try:
          response = requests.get(img, timeout=30)  # do not hang forever on an unresponsive host
          response.raise_for_status()
          if response.status_code in range(200,300) :
            content_type = response.headers.get('content-type', '')
            if 'application/json' not in content_type and 'text/' not in content_type:
              if not file_extension:
                im = Image.open(BytesIO(response.content))
                file_extension = '.'+im.format.lower()  # e.g. 'JPEG' -> '.jpeg'
              info_dict = base.upload_bytes_file(filename_without_extension+file_extension, response.content, file_type='image')
              uploaded_images.append(info_dict['url'])
              imgs_success += 1
            else:
              errors.append(f'Unknown error while getting file {img}')
              backup_images.append(img)
        except requests.exceptions.HTTPError as e:
          errors.append(f'{str(e.response.status_code)} error while getting file {img}')
          backup_images.append(img)
        except Exception as e:
          errors.append(f'{str(type(e))} error with file {img}')
          backup_images.append(img)
    row_data = {'row_id': row['_id'], 'row': {IMAGES_COLUMN_NAME: uploaded_images}}
    if backup_images:
      row_data['row'][BACKUP_IMAGES_COLUMN_NAME] = backup_images
    rows_data.append(row_data)
    if errors:
      print(f'====== row "{row.get(ROW_IDENTIFIER_COLUMN_NAME)}" ({row["_id"]}) ======')
      for err in errors:
        print(err)
      print('=====================================================')

row_update = base.batch_update_rows(TABLE_NAME,rows_data)
if row_update['success']:
  print(f'Successfully replaced {imgs_success}/{imgs_cpt} URL-based images')
else:
  print('Error while trying to link the uploaded images to the corresponding rows')
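
If you’d like to preview which images the script would try to download before it changes anything, here is a minimal dry-run sketch. It only reuses the same "is this an external URL?" test as the script above and works on plain row dictionaries, so it never writes to the base. The function names `is_external_image` and `find_external_images`, the `INTERNAL_PREFIX` value and the sample rows are all made up for illustration; they are not part of the SeaTable API.

```python
def is_external_image(img, internal_prefix):
    """Return True if img looks like a URL stored outside the base."""
    return (
        isinstance(img, str)
        and not img.startswith(internal_prefix)
        and not img.startswith(('data:', 'custom-asset://'))
    )


def find_external_images(rows, column, internal_prefix):
    """Map each row _id to the list of external image URLs found in `column`."""
    report = {}
    for row in rows:
        # Empty cells may be missing or None, hence the `or []`
        external = [img for img in (row.get(column) or [])
                    if is_external_image(img, internal_prefix)]
        if external:
            report[row['_id']] = external
    return report


# Hypothetical prefix; in a real base it would be built as in the script above
INTERNAL_PREFIX = 'https://cloud.seatable.io/workspace/1/asset/0000/images'

# Hypothetical rows, shaped like the dictionaries returned by base.list_rows()
sample_rows = [
    {'_id': 'r1', 'Images': ['https://example.com/a.jpg',
                             INTERNAL_PREFIX + '/2024-01/c.png']},
    {'_id': 'r2'},
    {'_id': 'r3', 'Images': ['https://photos.example.org/b',
                             'data:image/png;base64,AAAA']},
    {'_id': 'r4', 'Images': ['custom-asset://abc']},
]

report = find_external_images(sample_rows, 'Images', INTERNAL_PREFIX)
for row_id, urls in report.items():
    print(row_id, urls)
```

To try it on your own base, you would pass `base.list_rows(TABLE_NAME)` and `IMAGES_COLUMN_NAME` instead of the sample values, and build the prefix from `context.server_url`, `base.workspace_id` and `base.dtable_uuid` the same way the script does.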

Hope this will solve your problem…

Bests,
Benjamin

Hi bha,

thank you so much for replying. I don’t understand much of what you wrote and will have to look into it when I have more time. I will come back then with a reply. Anyhow, this is awesome data to work with. :hugs:

Merry Christmas!
