RFC: Remove FAST_FLOAT, TFLOAT #4385

Open
amitdo opened this issue Jan 24, 2025 · 2 comments

@amitdo
Collaborator

amitdo commented Jan 24, 2025

We should unconditionally use 32-bit float.

amitdo added the RFC label Jan 24, 2025

@stweil
Member

stweil commented Jan 24, 2025

I already started experiments with even smaller float data types (like they are used in GPUs) because this would accelerate the training. Up to now my experiments were not successful, but who knows, this might change in the future with better compiler support and the right libraries. Therefore knowing the code locations and having a special data type is still very helpful.

But I also don't think that anybody still has the need for the old double implementation.

@amitdo
Collaborator Author

amitdo commented Jan 26, 2025

> But I also don't think that anybody still has the need for the old double implementation.

My suggestion:

  • In cmake/autotools, FAST_FLOAT should always be defined (remove or comment out the relevant config code).
  • Remove all the double (64-bit) dot product code.
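
A rough sketch of what the two points above could look like in code. It assumes the float type is selected by a conditional alias keyed on `FAST_FLOAT`; the `#define`, the alias layout, and the `DotProductSketch` helper are illustrative assumptions, not the project's actual sources. Keeping the `TFloat` alias (rather than spelling out `float` everywhere) would still mark the code locations, which fits the wish to experiment with smaller float types later.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Assumption: under the suggestion, the build system always defines
// FAST_FLOAT. It is defined here by hand so the sketch is self-contained.
#define FAST_FLOAT 1

// Assumption: a conditional alias like this selects the float type.
#ifdef FAST_FLOAT
using TFloat = float;   // 32-bit path, the only one that would remain
#else
using TFloat = double;  // 64-bit path, proposed for removal
#endif

// With FAST_FLOAT always defined, TFloat is always 32-bit float.
static_assert(std::is_same<TFloat, float>::value,
              "TFloat is expected to be float when FAST_FLOAT is defined");

// Hypothetical generic dot product written against TFloat only.
// No separate double overload is needed once the 64-bit code is gone.
inline TFloat DotProductSketch(const TFloat *u, const TFloat *v, std::size_t n) {
  assert(n == 0 || (u != nullptr && v != nullptr));
  TFloat total = 0;
  for (std::size_t k = 0; k < n; ++k) {
    total += u[k] * v[k];
  }
  return total;
}
```

If a smaller float type becomes practical later, only the alias would need to change in this sketch; the call sites written against `TFloat` would stay as they are.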
