Time to go deeper into the new Safari Tech Preview 137
STP 137 landed on December 21st, just before Christmas, only 13 days after version 136. I was interested in running some tests with this version to see what has changed in terms of APIs and which new APIs could be worth learning. This article recaps what I did. So, is it a good gift for Christmas?
Under the Immersive Web group at the W3C, we can find a new proposal called HTMLModelElement. This proposal comes from Apple. The goal is to create a new DOM element that displays 3D content using a renderer built into the browser.
The proposal aims to provide a native way to consume 3D content directly, without having to rely on third-party libraries such as three.js. As explained in the first draft, in some cases a JavaScript library cannot render the content due to security restrictions or to the limitations of the <canvas> element, which is dedicated to a flat, two-dimensional surface in the web page.
So, for Augmented Reality, there is no native element able to display such content; that is the goal of this new element.
HTMLModelElement is available as an experimental feature in Safari Tech Preview 137, so it can already be tried.
Here is an example that shows the proposed structure for a <model> element.
```html
<model style="width: 400px; height: 300px" autoplay interactive>
  <source src="assets/example.usdz" type="model/vnd.usdz+zip">
  <source src="assets/example.glb" type="model/gltf-binary">
  <picture>
    <img src="animated-version.gif"/>
  </picture>
</model>
```
The `interactive` attribute allows manipulating the model without affecting the scroll position or zoom level of the page. This way, you can rotate the model and see all its faces.
As with the <video> element, <model> is able to display an image (a poster) while the content is loading or when it can't be loaded.
Note: I'm not an expert in 3D assets. I managed to display the usdz file and play with it, but not the glb one.
I traced the JavaScript API and here is the result:
Accessors | Methods |
---|---|
`currentSrc`, `ready` | `animationCurrentTime()`, `animationDuration()`, `enterFullscreen()`, `getCamera()`, `hasAudio()`, `isLoopingAnimation()`, `isMuted()`, `isPlayingAnimation()`, `pauseAnimation()`, `playAnimation()`, `setAnimationCurrentTime()`, `setCamera()`, `setIsLoopingAnimation()`, `setIsMuted()` |
This API is complemented by the HTMLElement API through inheritance (see the next paragraph).
Note: I tried to use the API, but I didn't manage to do anything with it at that time. Most of the methods are Promise-based and never resolved in my attempts.
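For illustration, here is a minimal sketch of how I tried to call the traced methods, assuming a `<model id="model3d">` element is in the page and the HTMLModelElement experimental feature is enabled. As noted above, these calls never resolved for me.

```js
// Minimal sketch, assuming a <model id="model3d"> element and the
// HTMLModelElement experimental feature enabled. All methods are
// Promise-based; in my tests they never resolved.
const model = document.getElementById('model3d');

async function inspectModel() {
  try {
    const duration = await model.animationDuration();
    const camera = await model.getCamera();
    console.log('animation duration:', duration, 'camera:', camera);
    await model.playAnimation();
  } catch (err) {
    console.error('HTMLModelElement call failed', err);
  }
}

inspectModel();
```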
The difference between an <audio> or <video> element and the <model> element is that they don’t inherit from the same direct parent:
```
HTMLModelElement --> HTMLElement
HTMLVideoElement --> HTMLMediaElement --> HTMLElement
HTMLAudioElement --> HTMLMediaElement --> HTMLElement
```
The consequence, if I'm not mistaken, is that it will perhaps not be possible to stream a model (animation) to a remote peer, since methods such as captureStream() come from HTMLMediaElement.
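A quick way to see the difference from the console (assuming the experimental feature is enabled so that `<model>` is recognized):

```js
// <model> is a plain HTMLElement, unlike <video>/<audio>, which are
// HTMLMediaElement instances and therefore expose methods like captureStream().
const model = document.createElement('model');
const video = document.createElement('video');

console.log(model instanceof HTMLElement);      // true
console.log(model instanceof HTMLMediaElement); // false
console.log(video instanceof HTMLMediaElement); // true
```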
HTMLModelElement has no direct relation with WebRTC, but as an element that can be animated and can carry audio, it may have something in common with it. At the time of writing, it is only available in Safari as an experimental feature. Wait and see!
I generated my WebRTC API surface and here are the differences I noticed. This is a comparison between the previous STP 136 and this release.
API | Changes |
---|---|
HTMLVideoElement | + `cancelVideoFrameCallback()`, + `requestVideoFrameCallback()` |
MediaSession | + `coordinator`, + `playlist`, + `readyState` |
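As a reminder, `requestVideoFrameCallback()` lets you run code on each presented video frame. Here is a minimal sketch, assuming a hypothetical `<video id="remote">` element attached to a remote WebRTC stream:

```js
// Minimal sketch of requestVideoFrameCallback(), assuming a <video id="remote">
// element playing a remote WebRTC stream.
const video = document.getElementById('remote');

function onFrame(now, metadata) {
  // metadata carries per-frame information such as presentationTime,
  // width and height.
  console.log('frame at', metadata.presentationTime, metadata.width, 'x', metadata.height);
  video.requestVideoFrameCallback(onFrame); // re-register for the next frame
}

video.requestVideoFrameCallback(onFrame);
```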
Regarding the Stats API, I saw the following differences:
Report | Changes |
---|---|
inbound-rtp (audio) | + `packetsDiscarded` (number) |
inbound-rtp (video) | + `jitterBufferDelay` (number), + `jitterBufferEmittedCount` (number) |
outbound-rtp (audio) | + `nackCount` (number) |
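To check these new properties, a minimal getStats() sketch can be used, assuming an existing RTCPeerConnection named `pc`:

```js
// Minimal sketch, assuming an existing RTCPeerConnection `pc`.
async function logNewStats() {
  const report = await pc.getStats();
  report.forEach((stats) => {
    if (stats.type === 'inbound-rtp' && stats.kind === 'audio') {
      console.log('packetsDiscarded:', stats.packetsDiscarded);
    }
    if (stats.type === 'inbound-rtp' && stats.kind === 'video') {
      console.log('jitterBufferDelay:', stats.jitterBufferDelay);
      console.log('jitterBufferEmittedCount:', stats.jitterBufferEmittedCount);
    }
    if (stats.type === 'outbound-rtp' && stats.kind === 'audio') {
      console.log('nackCount:', stats.nackCount);
    }
  });
}
```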
Here is a list of the other points I was interested in testing.
I tried the MediaRecorder API again, because this API is not easy to use when you then want to play the result in different browsers.
For that test, I developed a little application that takes a mimetype from a list, records a file for a few seconds, and then redoes the test with another codec, as sketched below.
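Here is a minimal sketch of that test, assuming `stream` comes from getUserMedia() and using one example mimetype built from a format and a codec:

```js
// Minimal sketch, assuming `stream` comes from getUserMedia().
// The mimetype below is just one example of format + codec.
const mimeType = 'video/webm;codecs=vp8';

if (MediaRecorder.isTypeSupported(mimeType)) {
  const recorder = new MediaRecorder(stream, { mimeType });
  const chunks = [];

  recorder.ondataavailable = (event) => chunks.push(event.data);
  recorder.onstop = () => {
    const blob = new Blob(chunks, { type: mimeType });
    console.log('Recorded', blob.size, 'bytes as', mimeType);
  };

  recorder.start();
  setTimeout(() => recorder.stop(), 10000); // record ~10 seconds
} else {
  console.log(mimeType, 'is not supported in this browser');
}
```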
Here are the results obtained by testing 4 formats (`mp4`, `webm`, `ogg`, `x-matroska`) and the following codecs: `vp8`, `vp9`, `h264`, `h265`, `av1`, `avc1`, `opus`, `pcm`, `aac`, `mpeg`, `mp4a`. Mimetypes are built by combining a format and a codec, such as `video/mp4;codecs=vp8`.
Info | Chrome | Firefox | Safari |
---|---|---|---|
Mimetypes supported | vp8, vp9, h264, avc1, opus, pcm (WEBM, MATROSKA) | vp8, opus (WEBM) | avc1, mp4a (MP4), seems to be H264/AAC |
Bit rate | No info | 128 kbps for audio, 2.5 Mbps for video | 192 kbps for audio, 10 Mbps for video |
Video size (10 sec) | 750 KB (av1), 1.5 MB (vp8) | 3.5 MB | 12.5 MB |
Viewer | Chrome, Firefox | Chrome, Firefox | Chrome, Firefox, Safari |
Note: `audioBitsPerSecond` and `videoBitsPerSecond` were always equal to 0 in Chrome M96.
Note: It is still not clear which separator to use in the mimetype: ':' or '='.
So it still seems complicated, as only video recorded with Safari can be played in all browsers. Video from Chrome and Firefox still seems not to be playable in Safari.
Another feature I wanted to try is the experimental feature Allow speaker device selection. It is not new, but I had never tried it in Safari.
When activated, calling `enumerateDevices` returns the available speakers, as in Chrome. The `groupId` attribute allows associating inputs and outputs, as in Chrome.
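For illustration, here is a minimal sketch that lists the speakers and matches them to microphones via `groupId`, assuming the experimental feature is enabled:

```js
// Minimal sketch, assuming the "Allow speaker device selection"
// experimental feature is enabled.
async function listSpeakers() {
  const devices = await navigator.mediaDevices.enumerateDevices();
  const speakers = devices.filter((d) => d.kind === 'audiooutput');

  speakers.forEach((speaker) => {
    // An input and an output belonging to the same physical device
    // share the same groupId.
    const matchingMic = devices.find(
      (d) => d.kind === 'audioinput' && d.groupId === speaker.groupId
    );
    console.log(speaker.label, '<->', matchingMic ? matchingMic.label : 'no matching input');
  });
}
```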
Then, in order to use a specific speaker, the experimental feature Allow per media element speaker device selection needs to be activated too.
Once done, you will be able to call the `setSinkId` method on your <video> or <audio> element.
The only difference I noticed is that the call has to be tied to a user gesture to work; otherwise you will face the following error:
```js
// Error
{ name: "NotAllowedError", message: "A user gesture is required" }
```
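Here is a minimal sketch, assuming an `<audio id="player">` element and a hypothetical `<button id="change-speaker">`: calling `setSinkId` from a click handler satisfies the user-gesture requirement.

```js
// Minimal sketch, assuming an <audio id="player"> element and a
// <button id="change-speaker"> in the page. Both experimental features
// mentioned above need to be enabled.
const player = document.getElementById('player');
const button = document.getElementById('change-speaker');

button.addEventListener('click', async () => {
  const devices = await navigator.mediaDevices.enumerateDevices();
  const speaker = devices.find((d) => d.kind === 'audiooutput');

  if (speaker) {
    try {
      await player.setSinkId(speaker.deviceId);
      console.log('Audio output routed to', speaker.label);
    } catch (err) {
      console.error('setSinkId failed:', err); // e.g. NotAllowedError
    }
  }
});
```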
This feature worked well in my tests. I hope it will be officially released soon.
The Permissions API has been available in Chrome for a while. I have written an article about it: How Permissions APIs help managing the authorizations?
To test the Permissions API in Safari, you need to activate the experimental feature Permissions API.
Once done, you can use the following code to get the status of the camera and the microphone:
```js
try {
  const microphonePermission = await navigator.permissions.query({ name: 'microphone' });

  if (microphonePermission.state === "denied") {
    // Do something in case the permission has been declined
  }

  microphonePermission.onchange = () => {
    // Do something when the permission changes
  };
} catch (err) {
  // Error treatment
}
```
What is strange is that even when the permission has been granted by the user, the state still equals `prompt`. I never received the `onchange` event.
Unfortunately, the basic use case tested does not seem to be complete: we need to rely on the `change` event to react to user actions. I hope it will be well supported soon.