Pros and cons of image file formats

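Many file formats exist for microscopy image data. Most microscope vendors have proprietary image file formats, and vendors of image analysis software offer custom and partially open formats. Traditional file formats also exist and are supported by common Python libraries. For general-purpose image storage, only a few formats such as TIF can be recommended, and formats such as JPEG should be avoided to preserve the integrity of the image data. In this notebook we test several file formats by re-saving a microscopy image in each of them, reloading it, and comparing the image before and after saving.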

from skimage.data import cells3d
from skimage.io import imread, imsave
import pyclesperanto_prototype as cle
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import os
import warnings

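As an example image we use the Cells 3D sample image shipped with scikit-image, which was provided by the Allen Institute for Cell Science. It is an unsigned 16-bit integer image. For a first attempt we select a single channel and a single plane.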

original_image = cells3d()[30, 1]
cle.imshow(original_image, colorbar=True)

../_images/2fd4f3cb5d1db35ea587fe3a672439cf57b8bdd1642ce9c43f80e1bf1cfb17c2.png

Why JPEG should be avoided

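The JPEG file format is, at least with the default settings of many programs, a lossy file format. This means that information is lost when the image is saved.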

imsave("temp.jpg", original_image)

Lossy conversion from uint16 to uint8. Losing 8 bits of resolution. Convert image to uint8 prior to saving to suppress this warning.

The warning above confirms this. Moreover, when we reload the image, the colorbar shows that the intensity range has changed.

jpg_image = imread("temp.jpg")

cle.imshow(jpg_image, colorbar=True)

../_images/c5432ef7b7511eb2722b8833f37ef829db454a67be29e67f421fb2d13b5be18e.png
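
The change in intensity range can be reproduced with plain NumPy. The sketch below assumes that the 16-to-8-bit conversion simply keeps the upper 8 bits of each pixel; this is an assumption, not a confirmed description of what `imsave` does internally, but it is consistent with the minimum and maximum of 4 and 227 that the statistics table later in this notebook reports for gif and bmp. The input values 1091 and 58327 are the minimum and maximum of the original slice from that same table.

```python
import numpy as np

# Minimum and maximum of the original 16-bit slice (from the statistics table)
values_16bit = np.array([1091, 58327], dtype=np.uint16)

# Assumption: the conversion keeps only the upper 8 bits of every pixel
values_8bit = np.right_shift(values_16bit, 8).astype(np.uint8)
print(values_8bit)

# Shifting back up cannot restore the discarded lower 8 bits
restored = values_8bit.astype(np.uint16) << 8
print(restored)
print(values_16bit - restored)  # the information that is lost for good
```

Whatever the exact conversion, 65536 possible intensity levels are mapped onto 256, so many distinct input values become indistinguishable after saving.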

Furthermore, if we zoom in, we can see two typical JPEG artifacts.

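The image looks denoised: fine pixel-to-pixel variation has been smoothed away.
We see patches, for example 8x8 pixels in size, and stripes running in different directions.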

fig, axs = plt.subplots(1, 2, figsize=(10, 10))

cle.imshow(original_image[140:170, 0:30], plot=axs[0])
cle.imshow(jpg_image[140:170, 0:30], plot=axs[1])

../_images/3a0ebffa4b1ae47385b05995c077b42b82de69c1cb33e089e90544f911e08366.png
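
The 8x8 patches are related to how JPEG works: it processes the image in blocks of 8x8 pixels, transforms each block with a discrete cosine transform (DCT), and stores the coefficients with reduced precision. The following is a simplified NumPy sketch of that idea, not the actual JPEG codec. The single uniform quantization step, the helper names `dct_matrix` and `blockwise_quantize`, and the synthetic test image are all assumptions made for illustration.

```python
import numpy as np

def dct_matrix(n=8):
    """Return the orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    matrix = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    matrix[0, :] = np.sqrt(1.0 / n)
    return matrix

def blockwise_quantize(image, step, block=8):
    """Round the DCT coefficients of every block x block tile to multiples of step.

    The image height and width must be multiples of `block`.
    """
    d = dct_matrix(block)
    result = np.zeros(image.shape, dtype=float)
    for y in range(0, image.shape[0], block):
        for x in range(0, image.shape[1], block):
            tile = image[y:y + block, x:x + block].astype(float)
            coefficients = d @ tile @ d.T                          # forward 2D DCT
            coefficients = np.round(coefficients / step) * step    # lossy step
            result[y:y + block, x:x + block] = d.T @ coefficients @ d  # inverse
    return result

# Synthetic 32 x 32 test image: a horizontal gradient plus noise
rng = np.random.default_rng(0)
gradient = np.linspace(0, 200, 32).reshape(1, -1) * np.ones((32, 1))
noisy = gradient + rng.normal(0, 5, size=(32, 32))

# A coarse step removes small coefficients, which mostly carry the noise
compressed = blockwise_quantize(noisy, step=40)
print(np.abs(compressed - noisy).max())
```

Because every tile is quantized independently, neighbouring tiles can end up with slightly different brightness or with different surviving frequency components. That would explain both the visible patch boundaries and the stripes inside patches.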

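To investigate these errors quantitatively and to test multiple file formats, we write a short function that saves and reloads the image and returns a table with some measurements.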

def resave_image_statistics(original_image):
    """Save the image in several formats, reload it, and return a table of image statistics."""

    # Prepare the table
    stats = {
        "ending":[],
        "data_type":[],
        "shape":[],
        "size":[],
        "min":[],
        "max":[],
        "mean":[],
        "standard_deviation":[],
        "mean_squared_error":[],
    }
    
    # Go through the different file endings
    endings = [None, "tif","png","mhd","mha","jpg","gif","bmp"]
    for ending in endings:
        try:
            if ending is None:
                # Use the original image as the first test
                reloaded_image = original_image
                size = np.nan
            else:
                # Save and reload the image
                filename = "temp." + ending
                imsave(filename, original_image)

                reloaded_image = imread(filename)
                size = os.path.getsize(filename)
            
            # Determine statistics
            stats["ending"].append(ending)
            stats["data_type"].append(reloaded_image.dtype)
            stats["shape"].append(reloaded_image.shape)
            stats["size"].append(size)
            stats["min"].append(np.min(reloaded_image))
            stats["max"].append(np.max(reloaded_image.astype(float)))
            stats["mean"].append(np.mean(reloaded_image))
            stats["standard_deviation"].append(np.std(reloaded_image))

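            # Cast to float before subtracting; unsigned-integer arithmetic would wrap around
            stats["mean_squared_error"].append(np.mean((original_image.astype(float) - reloaded_image.astype(float))**2))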

        except ValueError:
            # Some formats are not supported
            warnings.warn("File format not supported: " + ending)

    return pd.DataFrame(stats)
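
A point to keep in mind when computing a mean squared error between unsigned-integer images: NumPy carries out the subtraction and the squaring in the integer type of its operands, so results that are negative or too large wrap around silently. The sketch below uses a few illustrative pixel values and compares integer arithmetic with floating-point arithmetic.

```python
import numpy as np

# Illustrative pixel values before (16-bit) and after (8-bit) saving
original = np.array([1091, 58327], dtype=np.uint16)
reloaded = np.array([4, 227], dtype=np.uint8)

# uint16 arithmetic: the squared differences do not fit into 16 bits and wrap around
wrapped = (original - reloaded) ** 2
print(wrapped.dtype, wrapped)

# Floating-point arithmetic: the squared differences are represented correctly
exact = (original.astype(float) - reloaded.astype(float)) ** 2
print(exact.dtype, exact)
```

Converting both images to `float` before taking the difference avoids the problem, at the cost of a temporary array that needs more memory.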

Next, we apply the function to the 2D image slice.

resave_image_statistics(cells3d()[30, 1])

Lossy conversion from uint16 to uint8. Losing 8 bits of resolution. Convert image to uint8 prior to saving to suppress this warning.
Lossy conversion from uint16 to uint8. Losing 8 bits of resolution. Convert image to uint8 prior to saving to suppress this warning.
Lossy conversion from uint16 to uint8. Losing 8 bits of resolution. Convert image to uint8 prior to saving to suppress this warning.

	ending	data_type	shape	size	min	max	mean	standard_deviation	mean_squared_error
0	None	uint16	(256, 256)	NaN	1091	58327.0	9346.115417	6139.322637	0.000000
1	tif	uint16	(256, 256)	131328.0	1091	58327.0	9346.115417	6139.322637	0.000000
2	png	uint16	(256, 256)	107470.0	1091	58327.0	9346.115417	6139.322637	0.000000
3	mhd	uint16	(256, 256)	281.0	1091	58327.0	9346.115417	6139.322637	0.000000
4	mha	uint16	(256, 256)	131350.0	1091	58327.0	9346.115417	6139.322637	0.000000
5	jpg	uint8	(256, 256)	9271.0	0	237.0	36.005875	23.824527	32596.059555
6	gif	uint8	(256, 256)	58225.0	4	227.0	36.014725	23.978329	32236.271164
7	bmp	uint8	(256, 256)	66614.0	4	227.0	36.014725	23.978329	32236.271164

The mean_squared_error entries for jpg, gif and bmp in this output should not be taken at face value. The reloaded means (about 36) differ from the original mean (about 9346) by roughly 9310, so the true mean squared error must be at least 9310² ≈ 8.7 × 10^7. Values near 32,000 are consistent with the difference having been computed in unsigned 16-bit arithmetic, which wraps around.

Saving 3D images

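We can use the same function to test which file formats support 3D images. Here, `cells3d()[30]` is a single plane with both channels, that is, an array of shape (2, 256, 256).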

resave_image_statistics(cells3d()[30])

File format not supported: png
File format not supported: mhd
File format not supported: mha
File format not supported: jpg
File format not supported: gif
File format not supported: bmp

	ending	data_type	shape	size	min	max	mean	standard_deviation	mean_squared_error
0	None	uint16	(2, 256, 256)	NaN	277	58327.0	5925.968422	5711.584119	0.0
1	tif	uint16	(2, 256, 256)	262566.0	277	58327.0	5925.968422	5711.584119	0.0
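
If a format that only handles 2D images has to be used anyway, one possible workaround is to write every 2D plane to its own file. The sketch below only shows the bookkeeping. The helper `iter_planes` and its file-naming scheme are hypothetical, and the stack is a small placeholder array rather than the real dataset.

```python
import numpy as np

def iter_planes(stack, prefix="plane"):
    """Yield (filename_stem, 2D plane) for every plane of a 3D or higher-dimensional array.

    The last two axes are treated as the image plane; all leading axes are
    encoded into the file name stem.
    """
    leading_shape = stack.shape[:-2]
    for index in np.ndindex(*leading_shape):
        stem = prefix + "".join(f"_{i:03d}" for i in index)
        yield stem, stack[index]

# Placeholder stack with the same axis layout as cells3d(), but tiny planes
stack = np.zeros((60, 2, 8, 8), dtype=np.uint16)
planes = list(iter_planes(stack))
print(len(planes), planes[0][0], planes[-1][0])

# Each plane could then be written with a 2D-capable writer,
# for example: imsave(stem + ".png", plane)
```

The price of this workaround is that the relationship between the files lives only in their names, which is one reason a format with native multi-dimensional support is more convenient.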

Saving 4D images

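The cells3d dataset is actually a 4D dataset: in addition to the three spatial dimensions, it contains channels. We can therefore also use it to test saving and reloading 4D data.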

resave_image_statistics(cells3d())

File format not supported: png
File format not supported: mhd
File format not supported: mha
File format not supported: jpg
File format not supported: gif
File format not supported: bmp

	ending	data_type	shape	size	min	max	mean	standard_deviation	mean_squared_error
0	None	uint16	(60, 2, 256, 256)	NaN	0	65535.0	4837.14054	3985.348828	0.0
1	tif	uint16	(60, 2, 256, 256)	15748650.0	0	65535.0	4837.14054	3985.348828	0.0

Hence, the TIF file format can be recommended. It is also compatible with other software such as ImageJ and Fiji.
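
To put the size column into perspective, the file sizes can be compared with the number of bytes needed for the bare pixel values. The sketch below does this for the three TIF files, using the sizes copied from the tables above. The helper name `raw_size_in_bytes` is hypothetical.

```python
import numpy as np

def raw_size_in_bytes(shape, dtype):
    """Number of bytes needed for the bare pixel values, without any header."""
    return int(np.prod(shape)) * np.dtype(dtype).itemsize

# TIF file sizes in bytes, copied from the tables in this notebook
tif_sizes = {
    (256, 256): 131328,
    (2, 256, 256): 262566,
    (60, 2, 256, 256): 15748650,
}

for shape, file_size in tif_sizes.items():
    raw = raw_size_in_bytes(shape, np.uint16)
    overhead = file_size - raw
    print(shape, raw, overhead, f"{100 * overhead / raw:.2f} %")
```

In all three cases the TIF file is only slightly larger than the raw pixel data, which suggests that the pixels were stored uncompressed together with a small amount of metadata. By the same measure, the 2D PNG file (107470 bytes) is smaller than the raw data (131072 bytes) while its statistics are identical to the original, which is what lossless compression looks like.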

Exercise

Determine whether the 'ico' and 'raw' file formats are supported.
