Description
Feature Type
- Adding new functionality to pandas
- Changing existing functionality in pandas
- Removing existing functionality in pandas
Problem Description
Hi!
I found an issue with images in files created by WPS. The software allows images to be embedded directly into cells, using a formula of the form =DISPIMG("ID5BA4F81A0D674C7AA8849A79AC5645C8", 1).
As a result, these images cannot be accessed through worksheets._images.
If we unzip the Excel file, we find all the images under xl/media, and the image indexes in xl/_rels/cellimages.xml.rels and xl/cellimages.xml.
This appears to be unique to WPS; at least, I haven't found it in Office.
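As a rough illustration of the layout described above, the following sketch checks whether a workbook contains the two WPS index files without extracting anything. The function name has_wps_cell_images and the two constants are hypothetical, and treating the presence of both files as the marker for cell-embedded images is an assumption based on the paths mentioned in this issue.

```python
import zipfile

# Archive members named in this issue; their presence is assumed to indicate
# that the workbook contains WPS cell-embedded images.
WPS_CELLIMAGES = "xl/cellimages.xml"
WPS_CELLIMAGES_RELS = "xl/_rels/cellimages.xml.rels"


def has_wps_cell_images(file) -> bool:
    """Return True if the .xlsx archive contains both WPS cell-image parts.

    `file` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(file, "r") as zf:
        names = set(zf.namelist())
    return WPS_CELLIMAGES in names and WPS_CELLIMAGES_RELS in names
```

A check like this could be used to decide whether the extra parsing is needed at all before unzipping the file.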
I found a similar implementation
Feature Description
This is my code. It unzips the Excel file, reads xl/_rels/cellimages.xml.rels and xl/cellimages.xml, and returns a mapping from image ID to the path of the extracted image.
import os
import zipfile
import xml.etree.ElementTree as ET


def wps_embed_images(file_path, save_path) -> dict:
    """Extract the workbook to save_path and map each image ID to its file path."""
    img_map = {}
    with zipfile.ZipFile(file_path, "r") as zip_ref:
        zip_ref.extractall(save_path)

    # Relationship Id -> path of the extracted image file
    id2target = {}
    rels = os.path.join(save_path, "xl", "_rels", "cellimages.xml.rels")
    tree = ET.parse(rels)
    root = tree.getroot()
    for child in root:
        id2target[child.attrib.get("Id")] = os.path.join(save_path, "xl", child.attrib.get("Target"))

    namespaces = {
        'etc': 'http://www.wps.cn/officeDocument/2017/etCustomData',
        'xdr': 'http://schemas.openxmlformats.org/drawingml/2006/spreadsheetDrawing',
        'a': 'http://schemas.openxmlformats.org/drawingml/2006/main',
        'r': 'http://schemas.openxmlformats.org/officeDocument/2006/relationships'
    }

    # Image name (the image ID) -> relationship Id -> extracted image path
    cellimages = os.path.join(save_path, "xl", "cellimages.xml")
    tree = ET.parse(cellimages)
    root = tree.getroot()
    for cell_image in root.findall('etc:cellImage', namespaces):
        c_nv_pr = cell_image.find('.//xdr:cNvPr', namespaces)
        image_name = c_nv_pr.get('name') if c_nv_pr is not None else None
        blip = cell_image.find('.//a:blip', namespaces)
        embed_id = blip.get(f'{{{namespaces["r"]}}}embed') if blip is not None else None
        if image_name and embed_id:
            img_map[image_name] = id2target[embed_id]
    return img_map
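To connect the mapping back to individual cells, one possible follow-up is to pull the ID out of each DISPIMG formula and look it up. This is only a sketch: dispimg_id and resolve_cell_image are hypothetical helpers, the optional _xlfn. prefix and flexible whitespace in the pattern are assumptions about how the formula may be stored, and it assumes the keys of the mapping are the same IDs that appear in the formulas.

```python
import re

# Matches formulas such as =DISPIMG("ID5BA4...", 1). The optional "_xlfn."
# prefix and the tolerated whitespace are assumptions, not documented behavior.
DISPIMG_RE = re.compile(
    r'^=\s*(?:_xlfn\.)?DISPIMG\s*\(\s*"([^"]+)"\s*,\s*\d+\s*\)\s*$',
    re.IGNORECASE,
)


def dispimg_id(formula):
    """Return the image ID inside a DISPIMG formula, or None if it does not match."""
    if not isinstance(formula, str):
        return None
    match = DISPIMG_RE.match(formula)
    return match.group(1) if match else None


def resolve_cell_image(formula, img_map):
    """Look up the image path for a DISPIMG formula in an ID-to-path mapping."""
    image_id = dispimg_id(formula)
    return img_map.get(image_id) if image_id else None
```

With the mapping returned by wps_embed_images, resolve_cell_image(cell_formula, img_map) would give the extracted image path for a cell, or None for cells that hold anything other than a DISPIMG formula.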
Alternative Solutions
We leave pandas as it is, and I continue using the workaround shown above.
Additional Context
No response