Fetch float16 and float128 with h5grove as binary using dtype=safe #1561

Merged
merged 1 commit into from
Feb 8, 2024

@axelboc axelboc commented Feb 1, 2024

No description provided.

@axelboc axelboc changed the title Fetch float16 and float128 as binary using dtype=safe Fetch float16 and float128 with h5grove as binary using dtype=safe Feb 1, 2024
0,
1,
2,
3,
4,
null,
Infinity,
Contributor Author

The maximum finite float128 value gets converted to float64 Infinity. I would have expected NumPy to clamp it to the maximum finite float64 value instead. 🤔

Now that I think of it, a similar thing happens with the smallest float128 value greater than 0 (the value of the float128_scalar dataset in the sample file): h5grove sends 0 in float64, not the smallest float64 value greater than 0, as I would have expected.
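
For reference, the same overflow and underflow can be reproduced without float128. The following is a minimal TypeScript sketch, assuming IEEE 754 round-to-nearest narrowing and using float64 to float32 as a stand-in, since JavaScript has no float128 type. It is only an analogy: `narrowToFloat32` is a hypothetical helper, not h5grove or NumPy code.

```typescript
// Hypothetical helper: narrow a float64 to float32 precision.
// Math.fround rounds its argument to the nearest float32 value.
function narrowToFloat32(value: number): number {
  return Math.fround(value);
}

// Largest finite float64: far beyond the float32 range, so it overflows.
const overflowed = narrowToFloat32(Number.MAX_VALUE);

// Smallest positive float64 (a subnormal): far below the float32 range,
// so it underflows.
const underflowed = narrowToFloat32(Number.MIN_VALUE);

// A value that float32 can represent exactly is left unchanged.
const exact = narrowToFloat32(1.5);

console.log(overflowed, underflowed, exact); // Infinity 0 1.5
```

If the float128 to float64 conversion follows the same rounding rules, getting Infinity and 0 rather than clamped values would be the expected outcome.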

> The maximum finite float128 value gets converted to float64 Infinity

Sounds like a safe behaviour. It would be weird to convert a finite value to another while you know it's not the same value.

Contributor Author

> It would be weird to convert a finite value to another while you know it's not the same value.

That's exactly what happens when you cast an int64 to int32, though. Obviously, inf cannot be represented in int32, whereas in floating point it's a special value... The rounding to 0 makes more sense in that regard.

I'm just asking in case the code in h5grove is wrong somehow. Maybe the float128 values are cast by Python before NumPy can convert them properly, or something of the sort. 🤷
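
To illustrate the integer comparison, here is a small TypeScript sketch of integer narrowing, assuming JavaScript's ToInt32 conversion as the stand-in (it wraps modulo 2^32). `narrowToInt32` is a hypothetical helper, and NumPy's own int64 to int32 cast is not shown here.

```typescript
// Hypothetical helper: narrow a number to a signed 32-bit integer.
// The bitwise OR applies JavaScript's ToInt32 conversion to its operand.
function narrowToInt32(value: number): number {
  return value | 0;
}

// One past the int32 maximum wraps around to the int32 minimum:
// a finite value silently becomes a different finite value.
const wrapped = narrowToInt32(2147483648);

// Integers have no infinity; ToInt32 maps non-finite inputs to 0.
const fromInfinity = narrowToInt32(Infinity);

console.log(wrapped, fromInfinity); // -2147483648 0
```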

@loichuder
Member

Sorry, it took me a while to get back to this, and I need some context to refresh my memory.

Why does using dtype=safe make it possible to fetch float16 and float128 as binary?

@axelboc
Contributor Author
axelboc commented Feb 8, 2024

> Sorry, it took me a while to get back to this, and I need some context to refresh my memory.
>
> Why does using dtype=safe make it possible to fetch float16 and float128 as binary?

No worries!

There's no Float16Array or Float128Array in JS, so if we fetched float16/128 datasets with format=bin but without dtype=safe, we'd get binary data that we can't decode (at least not without custom libraries like https://github.com/petamoriken/float16 or https://github.com/munrocket/double.js).

When fetching float16/128 datasets with format=bin and dtype=safe, h5grove converts them to float32/64 respectively. With this knowledge, we can adjust our provider logic to say that when fetching float16, the response buffer should be passed into a Float32Array and when fetching float128, into a Float64Array. (Before, we were saying that float16/128 datasets had no corresponding typed array, which led to fetching them as JSON).
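
The mapping described above could be sketched roughly as follows. This is a minimal illustration, not the actual provider code in H5Web: the function name `decodeSafeFloats`, the dtype strings, and the simulated response buffer are all assumptions made for the example.

```typescript
type FloatDtype = 'float16' | 'float32' | 'float64' | 'float128';

// Hypothetical helper: wrap a binary response fetched with
// format=bin and dtype=safe in the matching typed array.
function decodeSafeFloats(
  dtype: FloatDtype,
  buffer: ArrayBuffer,
): Float32Array | Float64Array {
  switch (dtype) {
    case 'float16': // converted to float32 by h5grove with dtype=safe
    case 'float32':
      return new Float32Array(buffer);
    case 'float128': // converted to float64 by h5grove with dtype=safe
    case 'float64':
      return new Float64Array(buffer);
  }
}

// Simulated response body for a float16 dataset of three values,
// encoded as float32 (4 bytes per value).
const responseBuffer = new ArrayBuffer(12);
new Float32Array(responseBuffer).set([0.5, 1, 2]);

const decoded = decodeSafeFloats('float16', responseBuffer);
console.log(decoded); // Float32Array holding 0.5, 1, 2
```

The point of the design is that the dataset's declared type (float16 or float128) no longer matches the type of the bytes on the wire, so the lookup has to be keyed on what the server sends, not on what the file stores.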

packages/app/src/providers/h5grove/utils.ts (review thread: outdated, resolved)
@axelboc axelboc merged commit a09481f into main Feb 8, 2024
8 checks passed
@axelboc axelboc deleted the safe branch February 8, 2024 13:31

3 participants