I have a new phone…
…with a new, interesting feature that combines several of my favourite things: digital image formats and stereoscopy/3D.
Apple has slightly extended the HEIC image format for mixed reality glasses. Instead of one image, two slightly offset images are stored in a container together with some metadata. In the image gallery, for example, this combination makes an additional icon appear that activates the spatial view. The latest iPhones can also take such pictures.
Before we get into the boring technical details, here is an example with a historical stereogram from the VintageReality website. The button for the spatial view can be seen at the top right.
Implementation
The result was achieved through the following steps:
HEIC images
First, the images are extracted from the scans of the cards, using the model from the last post. The two half-images are then aligned with almost pixel-perfect accuracy using OpenCV, the edges left over by the alignment are cropped where possible, and the brightness is adjusted.
The two half-images are then embedded in an HEIC container using the Python module pillow_heif.
XMP metadata
The necessary metadata is encoded in XMP, comparable to old (2024) UltraHDR implementations.
The elements are located in the namespace http://ns.apple.com/image/1.0/ (preferred prefix apple).
| Element | Type | Description |
|---|---|---|
| HorizontalFOV | Real | Horizontal field of view in degrees. |
| Baseline | Real | Stereo baseline (eye distance) in millimetres. |
| HorizontalDisparityAdjustment | Real | Factor for adjusting the horizontal disparity (usually given as a fraction, e.g. 0.02 for 2%). |
| CameraModelType | String | The projection model type. Value used: SimplifiedPinhole. |
| CameraIntrinsics | String | Intrinsic camera parameters as a space-separated string: f_pix 0 ppx 0 f_pix ppy 0 0 1. |
| CameraExtrinsicsRotation | String | Rotation matrix as a space-separated string (row-oriented). Value: 1 0 0 0 1 0 0 0 1 (identity). |
| CameraExtrinsicsPosition | String | Position vector in metres as a space-separated string x y z. Value: 0 0 0. |
| StereoGroupIndex | Integer | Index for identifying the stereo group. Value: 1. |
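To make the table a bit more tangible, here is a minimal sketch of how these elements could be serialized as simple XMP properties. It is not the code used for this post: the function name build_xmp_packet is made up, the layout (one rdf:Description with one child element per property) is simply the usual way XMP stores simple values, and the xpacket wrapper is left out. Whether Apple's software expects exactly this layout is an assumption.

```python
import xml.etree.ElementTree as ET

X_NS = "adobe:ns:meta/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
APPLE_NS = "http://ns.apple.com/image/1.0/"


def build_xmp_packet(properties):
    """Serialize a dict of simple properties into the apple namespace.

    Hypothetical helper: it only shows where the elements from the
    table end up inside an XMP document.
    """
    for prefix, uri in (("x", X_NS), ("rdf", RDF_NS), ("apple", APPLE_NS)):
        ET.register_namespace(prefix, uri)

    xmpmeta = ET.Element(f"{{{X_NS}}}xmpmeta")
    rdf = ET.SubElement(xmpmeta, f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": ""}
    )
    # One child element per property, all values stored as text.
    for name, value in properties.items():
        ET.SubElement(description, f"{{{APPLE_NS}}}{name}").text = str(value)
    return ET.tostring(xmpmeta, encoding="unicode")


packet = build_xmp_packet(
    {
        "HorizontalFOV": 45,
        "Baseline": 65,
        "HorizontalDisparityAdjustment": 0.02,
        "CameraModelType": "SimplifiedPinhole",
        "CameraExtrinsicsRotation": "1 0 0 0 1 0 0 0 1",
        "CameraExtrinsicsPosition": "0 0 0",
        "StereoGroupIndex": 1,
    }
)
print(packet)
```

CameraIntrinsics is missing here on purpose, because it depends on the size of the input images.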
Many of these values are estimated, as exact values for the old cameras used cannot be determined:
| Parameter | Value |
|---|---|
| Horizontal field of view | 45° |
| Interpupillary distance | 65 mm |
| Disparity | 2% |
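Theoretically, the disparity could also be determined with OpenCV during alignment, but small deviations are insignificant. The remaining values can be calculated from the estimated values and the size of the input images: f_pix is (width / 2) / tan(HorizontalFOV / 2), and the principal point (ppx, ppy) is presumably the image centre, (width / 2, height / 2).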
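As a sketch of that calculation: with a pinhole model, the focal length in pixels follows from the image width and the horizontal field of view. The function name pinhole_intrinsics is made up for this example, and placing the principal point at the image centre is an assumption, although a natural one for scans without calibration data.

```python
import math


def pinhole_intrinsics(width_px, height_px, hfov_deg):
    """Return the intrinsics as the string 'f_pix 0 ppx 0 f_pix ppy 0 0 1'."""
    # Half the image width spans half the field of view:
    # tan(hfov / 2) = (width / 2) / f_pix
    f_pix = (width_px / 2) / math.tan(math.radians(hfov_deg) / 2)
    # Assumption: the principal point is the image centre.
    ppx = width_px / 2
    ppy = height_px / 2
    values = [f_pix, 0, ppx, 0, f_pix, ppy, 0, 0, 1]
    # The 'g' format keeps about six significant digits and drops trailing zeros.
    return " ".join(f"{v:g}" for v in values)


# Example with the estimated 45 degree field of view and a 2000 x 1000 pixel image.
print(pinhole_intrinsics(2000, 1000, 45))
```

A narrower field of view gives a larger f_pix for the same image width, so the 45° estimate directly influences how deep the scene appears.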
Further information from Apple Developer
Injecting metadata
Finally, the metadata must be injected into the image file. After some configuration (for the namespace and elements), this can be done with Phil Harvey’s exiftool.
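The call could be assembled roughly like this. This is a sketch, not the exact command used here: the helper exiftool_command and the file names are made up, and the group name XMP-apple only works if the config file defines the namespace under that name. The -config and -overwrite_original options are regular exiftool options.

```python
def exiftool_command(image_path, config_path, tags):
    """Build the argument list for an exiftool call that writes the tags.

    Assumption: the config file registers the apple namespace so that
    the tags are addressable as XMP-apple:<Element>.
    """
    # -config has to be the first option on the exiftool command line.
    cmd = ["exiftool", "-config", config_path]
    for name, value in tags.items():
        cmd.append(f"-XMP-apple:{name}={value}")
    # Write in place instead of keeping a *_original backup copy.
    cmd += ["-overwrite_original", image_path]
    return cmd


cmd = exiftool_command(
    "stereo.heic",   # hypothetical file name
    "apple.config",  # hypothetical config file name
    {
        "HorizontalFOV": 45,
        "Baseline": 65,
        "HorizontalDisparityAdjustment": 0.02,
        "CameraModelType": "SimplifiedPinhole",
    },
)
print(cmd)
```

The list can be passed to subprocess.run(cmd, check=True) as is. Values containing spaces, such as the intrinsics string, need no extra quoting that way because no shell is involved.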
Results
In principle, the process works, although the results for indoor shots are significantly better than for outdoor shots. For outdoor shots, it can help to compress the spatial staggering of the image planes slightly.
To try it out, these files can be saved on the iPhone in the “Photos” app.