What's the best way to do fixed-point math?

Question

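I need to speed up a program for the Nintendo DS, which doesn't have an FPU, so I have to change the floating-point math (which is emulated and slow) to fixed-point.

I started by changing floats to ints and, whenever I needed to convert them, I used x >> 8 to convert the fixed-point variable x to the actual number and x << 8 to convert to fixed-point. I soon found it was impossible to keep track of what needed to be converted, and I also realized it would be hard to change the precision of the numbers (8 in this case).

My question is, how should I make this easier and still fast? Should I make a FixedPoint class, or just a FixedPoint8 typedef or struct with some functions/macros to convert them, or something else? Should I put something in the variable name to show that it's fixed-point?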

Evan Teran · Answer

You can try my fixed-point class (latest version available at https://github.com/eteran/cpp-utilities ):

    // From: https://github.com/eteran/cpp-utilities/edit/master/Fixed.h
    // See also: http://stackoverflow.com/questions/79677/whats-the-best-way-to-do-fixed-point-math
    /*
     * The MIT License (MIT)
     *
     * Copyright (c) 2015 Evan Teran
     *
     * Permission is hereby granted, free of charge, to any person obtaining a copy
     * of this software and associated documentation files (the "Software"), to deal
     * in the Software without restriction, including without limitation the rights
     * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
     * copies of the Software, and to permit persons to whom the Software is
     * furnished to do so, subject to the following conditions:
     *
     * The above copyright notice and this permission notice shall be included in all
     * copies or substantial portions of the Software.
     *
     * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
     * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
     * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
     * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
     * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
     * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
     * SOFTWARE.
     */

    #ifndef FIXED_H_
    #define FIXED_H_

    #include <ostream>
    #include <exception>
    #include <cstddef> // for size_t
    #include <cstdint>
    #include <type_traits>

    #include <boost/operators.hpp>

    namespace numeric {

    template <size_t I, size_t F>
    class Fixed;

    namespace detail {

    // helper templates to make magic with types :)
    // these allow us to determine resonable types from
    // a desired size, they also let us infer the next largest type
    // from a type which is Nice for the division op
    template <size_t T>
    struct type_from_size {
        static const bool is_specialized = false;
        typedef void      value_type;
    };

    #if defined(__GNUC__) && defined(__x86_64__)
    template <>
    struct type_from_size<128> {
        static const bool           is_specialized = true;
        static const size_t         size = 128;
        typedef __int128            value_type;
        typedef unsigned __int128   unsigned_type;
        typedef __int128            signed_type;
        typedef type_from_size<256> next_size;
    };
    #endif

    template <>
    struct type_from_size<64> {
        static const bool           is_specialized = true;
        static const size_t         size = 64;
        typedef int64_t             value_type;
        typedef uint64_t            unsigned_type;
        typedef int64_t             signed_type;
        typedef type_from_size<128> next_size;
    };

    template <>
    struct type_from_size<32> {
        static const bool          is_specialized = true;
        static const size_t        size = 32;
        typedef int32_t            value_type;
        typedef uint32_t           unsigned_type;
        typedef int32_t            signed_type;
        typedef type_from_size<64> next_size;
    };

    template <>
    struct type_from_size<16> {
        static const bool          is_specialized = true;
        static const size_t        size = 16;
        typedef int16_t            value_type;
        typedef uint16_t           unsigned_type;
        typedef int16_t            signed_type;
        typedef type_from_size<32> next_size;
    };

    template <>
    struct type_from_size<8> {
        static const bool          is_specialized = true;
        static const size_t        size = 8;
        typedef int8_t             value_type;
        typedef uint8_t            unsigned_type;
        typedef int8_t             signed_type;
        typedef type_from_size<16> next_size;
    };

    // this is to assist in adding support for non-native base
    // types (for adding big-int support), this should be fine
    // unless your bit-int class doesn't nicely support casting
    template <class B, class N>
    B next_to_base(const N& rhs) {
        return static_cast<B>(rhs);
    }

    struct divide_by_zero : std::exception {
    };

    template <size_t I, size_t F>
    Fixed<I,F> divide(const Fixed<I,F> &numerator, const Fixed<I,F> &denominator, Fixed<I,F> &remainder, typename std::enable_if<type_from_size<I+F>::next_size::is_specialized>::type* = 0) {

        typedef typename Fixed<I,F>::next_type next_type;
        typedef typename Fixed<I,F>::base_type base_type;
        static const size_t fractional_bits = Fixed<I,F>::fractional_bits;

        next_type t(numerator.to_raw());
        t <<= fractional_bits;

        Fixed<I,F> quotient;

        quotient  = Fixed<I,F>::from_base(next_to_base<base_type>(t / denominator.to_raw()));
        remainder = Fixed<I,F>::from_base(next_to_base<base_type>(t % denominator.to_raw()));

        return quotient;
    }

    template <size_t I, size_t F>
    Fixed<I,F> divide(Fixed<I,F> numerator, Fixed<I,F> denominator, Fixed<I,F> &remainder, typename std::enable_if<!type_from_size<I+F>::next_size::is_specialized>::type* = 0) {

        // NOTE(eteran): division is broken for large types :-(
        // especially when dealing with negative quantities

        typedef typename Fixed<I,F>::base_type     base_type;
        typedef typename Fixed<I,F>::unsigned_type unsigned_type;

        static const int bits = Fixed<I,F>::total_bits;

        if(denominator == 0) {
            throw divide_by_zero();
        } else {

            int sign = 0;

            Fixed<I,F> quotient;

            if(numerator < 0) {
                sign ^= 1;
                numerator = -numerator;
            }

            if(denominator < 0) {
                sign ^= 1;
                denominator = -denominator;
            }

            base_type n      = numerator.to_raw();
            base_type d      = denominator.to_raw();
            base_type x      = 1;
            base_type answer = 0;

            // egyptian division algorithm
            while((n >= d) && (((d >> (bits - 1)) & 1) == 0)) {
                x <<= 1;
                d <<= 1;
            }

            while(x != 0) {
                if(n >= d) {
                    n      -= d;
                    answer += x;
                }

                x >>= 1;
                d >>= 1;
            }

            unsigned_type l1 = n;
            unsigned_type l2 = denominator.to_raw();

            // calculate the lower bits (needs to be unsigned)
            // unfortunately for many fractions this overflows the type still :-/
            const unsigned_type lo = (static_cast<unsigned_type>(n) << F) / denominator.to_raw();

            quotient  = Fixed<I,F>::from_base((answer << F) | lo);
            remainder = n;

            if(sign) {
                quotient = -quotient;
            }

            return quotient;
        }
    }

    // this is the usual implementation of multiplication
    template <size_t I, size_t F>
    void multiply(const Fixed<I,F> &lhs, const Fixed<I,F> &rhs, Fixed<I,F> &result, typename std::enable_if<type_from_size<I+F>::next_size::is_specialized>::type* = 0) {

        typedef typename Fixed<I,F>::next_type next_type;
        typedef typename Fixed<I,F>::base_type base_type;

        static const size_t fractional_bits = Fixed<I,F>::fractional_bits;

        next_type t(static_cast<next_type>(lhs.to_raw()) * static_cast<next_type>(rhs.to_raw()));
        t >>= fractional_bits;
        result = Fixed<I,F>::from_base(next_to_base<base_type>(t));
    }

    // this is the fall back version we use when we don't have a next size
    // it is slightly slower, but is more robust since it doesn't
    // require and upgraded type
    template <size_t I, size_t F>
    void multiply(const Fixed<I,F> &lhs, const Fixed<I,F> &rhs, Fixed<I,F> &result, typename std::enable_if<!type_from_size<I+F>::next_size::is_specialized>::type* = 0) {

        typedef typename Fixed<I,F>::base_type base_type;

        static const size_t fractional_bits = Fixed<I,F>::fractional_bits;
        static const size_t integer_mask    = Fixed<I,F>::integer_mask;
        static const size_t fractional_mask = Fixed<I,F>::fractional_mask;

        // more costly but doesn't need a larger type
        const base_type a_hi = (lhs.to_raw() & integer_mask) >> fractional_bits;
        const base_type b_hi = (rhs.to_raw() & integer_mask) >> fractional_bits;
        const base_type a_lo = (lhs.to_raw() & fractional_mask);
        const base_type b_lo = (rhs.to_raw() & fractional_mask);

        const base_type x1 = a_hi * b_hi;
        const base_type x2 = a_hi * b_lo;
        const base_type x3 = a_lo * b_hi;
        const base_type x4 = a_lo * b_lo;

        result = Fixed<I,F>::from_base((x1 << fractional_bits) + (x3 + x2) + (x4 >> fractional_bits));
    }

    }

    /*
     * inheriting from boost::operators enables us to be a drop in replacement for base types
     * without having to specify all the different versions of operators manually
     */
    template <size_t I, size_t F>
    class Fixed : boost::operators<Fixed<I,F>> {
        static_assert(detail::type_from_size<I + F>::is_specialized, "invalid combination of sizes");

    public:
        static const size_t fractional_bits = F;
        static const size_t integer_bits    = I;
        static const size_t total_bits      = I + F;

        typedef detail::type_from_size<total_bits>             base_type_info;

        typedef typename base_type_info::value_type            base_type;
        typedef typename base_type_info::next_size::value_type next_type;
        typedef typename base_type_info::unsigned_type         unsigned_type;

    public:
        static const size_t base_size          = base_type_info::size;
        static const base_type fractional_mask = ~((~base_type(0)) << fractional_bits);
        static const base_type integer_mask    = ~fractional_mask;

    public:
        static const base_type one = base_type(1) << fractional_bits;

    public: // constructors
        Fixed() : data_(0) {
        }

        Fixed(long n) : data_(base_type(n) << fractional_bits) {
            // TODO(eteran): assert in range!
        }

        Fixed(unsigned long n) : data_(base_type(n) << fractional_bits) {
            // TODO(eteran): assert in range!
        }

        Fixed(int n) : data_(base_type(n) << fractional_bits) {
            // TODO(eteran): assert in range!
        }

        Fixed(unsigned int n) : data_(base_type(n) << fractional_bits) {
            // TODO(eteran): assert in range!
        }

        Fixed(float n) : data_(static_cast<base_type>(n * one)) {
            // TODO(eteran): assert in range!
        }

        Fixed(double n) : data_(static_cast<base_type>(n * one)) {
            // TODO(eteran): assert in range!
        }

        Fixed(const Fixed &o) : data_(o.data_) {
        }

        Fixed& operator=(const Fixed &o) {
            data_ = o.data_;
            return *this;
        }

    private:
        // this makes it simpler to create a fixed point object from
        // a native type without scaling
        // use "Fixed::from_base" in order to perform this.
        struct NoScale {};

        Fixed(base_type n, const NoScale &) : data_(n) {
        }

    public:
        static Fixed from_base(base_type n) {
            return Fixed(n, NoScale());
        }

    public: // comparison operators
        bool operator==(const Fixed &o) const {
            return data_ == o.data_;
        }

        bool operator<(const Fixed &o) const {
            return data_ < o.data_;
        }

    public: // unary operators
        bool operator!() const {
            return !data_;
        }

        Fixed operator~() const {
            Fixed t(*this);
            t.data_ = ~t.data_;
            return t;
        }

        Fixed operator-() const {
            Fixed t(*this);
            t.data_ = -t.data_;
            return t;
        }

        Fixed operator+() const {
            return *this;
        }

        Fixed& operator++() {
            data_ += one;
            return *this;
        }

        Fixed& operator--() {
            data_ -= one;
            return *this;
        }

    public: // basic math operators
        Fixed& operator+=(const Fixed &n) {
            data_ += n.data_;
            return *this;
        }

        Fixed& operator-=(const Fixed &n) {
            data_ -= n.data_;
            return *this;
        }

        Fixed& operator&=(const Fixed &n) {
            data_ &= n.data_;
            return *this;
        }

        Fixed& operator|=(const Fixed &n) {
            data_ |= n.data_;
            return *this;
        }

        Fixed& operator^=(const Fixed &n) {
            data_ ^= n.data_;
            return *this;
        }

        Fixed& operator*=(const Fixed &n) {
            detail::multiply(*this, n, *this);
            return *this;
        }

        Fixed& operator/=(const Fixed &n) {
            Fixed temp;
            *this = detail::divide(*this, n, temp);
            return *this;
        }

        Fixed& operator>>=(const Fixed &n) {
            data_ >>= n.to_int();
            return *this;
        }

        Fixed& operator<<=(const Fixed &n) {
            data_ <<= n.to_int();
            return *this;
        }

    public: // conversion to basic types
        int to_int() const {
            return (data_ & integer_mask) >> fractional_bits;
        }

        unsigned int to_uint() const {
            return (data_ & integer_mask) >> fractional_bits;
        }

        float to_float() const {
            return static_cast<float>(data_) / Fixed::one;
        }

        double to_double() const {
            return static_cast<double>(data_) / Fixed::one;
        }

        base_type to_raw() const {
            return data_;
        }

    public:
        void swap(Fixed &rhs) {
            using std::swap;
            swap(data_, rhs.data_);
        }

    public:
        base_type data_;
    };

    // if we have the same fractional portion, but differing integer portions, we trivially upgrade the smaller type
    template <size_t I1, size_t I2, size_t F>
    typename std::conditional<I1 >= I2, Fixed<I1,F>, Fixed<I2,F>>::type operator+(const Fixed<I1,F> &lhs, const Fixed<I2,F> &rhs) {

        typedef typename std::conditional<
            I1 >= I2,
            Fixed<I1,F>,
            Fixed<I2,F>
        >::type T;

        const T l = T::from_base(lhs.to_raw());
        const T r = T::from_base(rhs.to_raw());
        return l + r;
    }

    template <size_t I1, size_t I2, size_t F>
    typename std::conditional<I1 >= I2, Fixed<I1,F>, Fixed<I2,F>>::type operator-(const Fixed<I1,F> &lhs, const Fixed<I2,F> &rhs) {

        typedef typename std::conditional<
            I1 >= I2,
            Fixed<I1,F>,
            Fixed<I2,F>
        >::type T;

        const T l = T::from_base(lhs.to_raw());
        const T r = T::from_base(rhs.to_raw());
        return l - r;
    }

    template <size_t I1, size_t I2, size_t F>
    typename std::conditional<I1 >= I2, Fixed<I1,F>, Fixed<I2,F>>::type operator*(const Fixed<I1,F> &lhs, const Fixed<I2,F> &rhs) {

        typedef typename std::conditional<
            I1 >= I2,
            Fixed<I1,F>,
            Fixed<I2,F>
        >::type T;

        const T l = T::from_base(lhs.to_raw());
        const T r = T::from_base(rhs.to_raw());
        return l * r;
    }

    template <size_t I1, size_t I2, size_t F>
    typename std::conditional<I1 >= I2, Fixed<I1,F>, Fixed<I2,F>>::type operator/(const Fixed<I1,F> &lhs, const Fixed<I2,F> &rhs) {

        typedef typename std::conditional<
            I1 >= I2,
            Fixed<I1,F>,
            Fixed<I2,F>
        >::type T;

        const T l = T::from_base(lhs.to_raw());
        const T r = T::from_base(rhs.to_raw());
        return l / r;
    }

    template <size_t I, size_t F>
    std::ostream &operator<<(std::ostream &os, const Fixed<I,F> &f) {
        os << f.to_double();
        return os;
    }

    template <size_t I, size_t F>
    const size_t Fixed<I,F>::fractional_bits;

    template <size_t I, size_t F>
    const size_t Fixed<I,F>::integer_bits;

    template <size_t I, size_t F>
    const size_t Fixed<I,F>::total_bits;

    }

    #endif

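It is designed to be a near drop-in replacement for floats/doubles and has a selectable precision. It uses boost to add all the necessary math operator overloads, so you will need that as well (I believe this is only a header dependency, not a library dependency).

By the way, common usage could look something like this:

    using namespace numeric;
    typedef Fixed<16, 16> fixed;
    fixed f;

The only real rule is that the two numbers have to add up to a native size of your system, such as 8, 16, 32, or 64.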
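To make the sizing rule concrete, here is a minimal sketch of what a 16.16 layout means at the raw-integer level. It does not use Fixed.h; the names kIntegerBits, kFractionalBits, kOne, to_raw and to_double are illustrative assumptions, not part of the library above.

```cpp
#include <cstdint>

// Hypothetical names for illustration only; this is not code from Fixed.h.
// A 16.16 layout: 16 integer bits + 16 fractional bits = 32 bits in total,
// so the whole value fits exactly in a native 32-bit integer.
constexpr int kIntegerBits = 16;
constexpr int kFractionalBits = 16;
static_assert(kIntegerBits + kFractionalBits == 32,
              "integer + fractional bits must add up to a native width");

// Raw representation of 1.0 in this layout (2^16 = 65536).
constexpr std::int32_t kOne = std::int32_t{1} << kFractionalBits;

// Scale a real value into the raw representation (truncates toward zero).
constexpr std::int32_t to_raw(double value) {
    return static_cast<std::int32_t>(value * kOne);
}

// Recover the real value from the raw representation.
constexpr double to_double(std::int32_t raw) {
    return static_cast<double>(raw) / kOne;
}
```

With this layout, 1.5 is stored as the raw integer 98304 (1.5 * 65536), and dividing by 65536 gives the real value back.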

Antti Kissaniemi · Answer

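In modern C++ implementations, there is no performance penalty for using simple and lean abstractions such as concrete classes. Fixed-point computation is precisely the place where a properly engineered class will save you from lots of bugs.

Therefore, you should write a FixedPoint8 class. Test and debug it thoroughly. If you need to convince yourself of its performance compared to using plain integers, measure it.

Moving the complexity of fixed-point calculation into a single place will save you a lot of trouble.

If you like, you can further increase the utility of your class by making it a template and replacing the old FixedPoint8 with, say, typedef FixedPoint<short, 8> FixedPoint8;. On your target architecture this is probably not necessary, though, so avoid the complexity of templates at first.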
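As a rough sketch of the templated variant mentioned above (the interface here, including from_raw, raw and to_int, is an assumption for illustration, not an existing library), the storage type and the number of fraction bits can both become template parameters:

```cpp
#include <cstdint>

// Sketch of a templated fixed-point type; all names here are hypothetical.
template <typename Storage, int FractionBits>
class FixedPoint {
private:
    struct RawTag {};
    constexpr FixedPoint(Storage raw, RawTag) : raw_(raw) {}
    Storage raw_;

public:
    constexpr FixedPoint() : raw_(0) {}

    // Convert from a whole number by scaling it up by 2^FractionBits.
    constexpr explicit FixedPoint(int value)
        : raw_(static_cast<Storage>(value * (1 << FractionBits))) {}

    // Wrap an already-scaled value without scaling it again.
    static constexpr FixedPoint from_raw(Storage raw) {
        return FixedPoint(raw, RawTag{});
    }

    constexpr Storage raw() const { return raw_; }

    // Drop the fraction bits. Right-shifting a negative value is
    // implementation-defined before C++20 (arithmetic on mainstream compilers).
    constexpr int to_int() const { return raw_ >> FractionBits; }

    friend constexpr FixedPoint operator+(FixedPoint a, FixedPoint b) {
        return from_raw(static_cast<Storage>(a.raw_ + b.raw_));
    }

    // Multiply in a wider type, then remove the extra FractionBits.
    friend constexpr FixedPoint operator*(FixedPoint a, FixedPoint b) {
        return from_raw(static_cast<Storage>(
            (static_cast<std::int64_t>(a.raw_) * b.raw_) >> FractionBits));
    }

    friend constexpr bool operator==(FixedPoint a, FixedPoint b) {
        return a.raw_ == b.raw_;
    }
};

typedef FixedPoint<short, 8> FixedPoint8;
```

Note that with a 16-bit short and 8 fraction bits only 8 bits remain for the sign and integer part, so whole values are limited to roughly -128 to 127; nothing in this sketch checks for overflow.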

There is probably a good fixed-point class somewhere on the internet; I'd start looking in the Boost libraries.

ryu · Answer

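Does your floating-point code actually make use of the decimal point? If so:

First, you should read Randy Yates's paper on an introduction to fixed-point math: http://www.digitalsignallabs.com/fp.pdf

Then you need to "profile" your floating-point code to figure out the appropriate range of fixed-point values required at the "critical" points in your code, e.g. U(5,3) = 5 bits to the left, 3 bits to the right, unsigned.

At this point, you can apply the arithmetic rules from the paper mentioned above; the rules specify how to interpret the bits that result from arithmetic operations. You can write macros or functions to perform the operations.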
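As one illustration of such a rule, multiplying two unsigned values adds both the integer and the fraction bit counts, so U(5,3) times U(5,3) yields U(10,6). The sketch below assumes that rule; the function names mul_u5_3 and u10_6_to_u5_3 are made up for this example.

```cpp
#include <cstdint>

// U(5,3): 8-bit unsigned value with 5 integer bits and 3 fraction bits,
// i.e. the stored integer is the real value times 2^3.

// Product rule: U(5,3) * U(5,3) -> U(10,6), a 16-bit value with 6 fraction bits.
// The full 16-bit product is kept, so nothing is lost at this step.
constexpr std::uint16_t mul_u5_3(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint16_t>(static_cast<std::uint16_t>(a) * b);
}

// Rescale a U(10,6) result back to U(5,3) by dropping 3 fraction bits.
// This truncates, and integer bits above the fifth are silently discarded.
constexpr std::uint8_t u10_6_to_u5_3(std::uint16_t product) {
    return static_cast<std::uint8_t>(product >> 3);
}
```

For example, 2.5 is stored as 20 and 1.25 as 10 in U(5,3); their product is 200, which read as U(10,6) is 200 / 64 = 3.125, and rescaled to U(5,3) is 25 (25 / 8 = 3.125).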

It's handy to keep the floating-point version around, so that you can compare the floating-point and fixed-point results.
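A comparison harness for this can be very small. The following is only a sketch under assumed names (scale_float, scale_fixed, fixed_error) and an assumed 8-bit fraction; the computation itself, scaling by 0.3, is a stand-in for whatever your real code does.

```cpp
#include <cmath>
#include <cstdint>

constexpr int kFractionBits = 8; // assumed precision for this example

// Reference implementation, kept in floating point.
inline float scale_float(float x) {
    return x * 0.3f;
}

// Fixed-point version: 0.3 is approximated by 77/256.
inline std::int32_t scale_fixed(std::int32_t x_raw) {
    const std::int32_t k_raw = 77; // round(0.3 * 256)
    return (x_raw * k_raw) >> kFractionBits;
}

// Absolute difference between the fixed-point and floating-point results
// for the same input, expressed in real units.
inline float fixed_error(float x) {
    const std::int32_t x_raw =
        static_cast<std::int32_t>(std::lround(x * (1 << kFractionBits)));
    const float fixed_result =
        static_cast<float>(scale_fixed(x_raw)) / (1 << kFractionBits);
    return std::fabs(fixed_result - scale_float(x));
}
```

Running fixed_error over a range of representative inputs gives a quick picture of whether the chosen number of fraction bits is sufficient.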

Bart · Answer

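Changing fixed-point representations is commonly called "scaling".

If you can do this with a class at no performance penalty, then that's the way to go. It depends heavily on the compiler and how it inlines. If there is a performance penalty for using classes, then you need a more traditional C-style approach. The OOP approach gives you compiler-enforced type safety, which the traditional implementation only approximates.

@cibyr has a good OOP implementation. Now for the more traditional one.

To keep track of which variables are scaled, you need to use a consistent convention. Add a notation at the end of each variable name to indicate whether the value is scaled or not, and write macros SCALE() and UNSCALE() that expand to x >> 8 and x << 8.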

    #define SCALE(x)   ((x) >> 8)
    #define UNSCALE(x) ((x) << 8)

    xPositionUnscaled = UNSCALE(10);
    xPositionScaled = SCALE(xPositionUnscaled);

It may seem like extra work to use so much notation, but notice how you can tell at a glance that any given line is correct without looking at other lines. For example:

xPositionScaled = SCALE(xPositionScaled);

is obviously wrong by inspection.

This is a variation of the Apps Hungarian idea that Joel mentions in this post.
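For comparison, the compiler-enforced type safety mentioned earlier can be sketched in C++ with two distinct wrapper types, so that the kind of mistake the naming convention exposes by inspection fails to compile instead. The names WholeUnits, FixedUnits, to_fixed and to_whole are hypothetical, chosen to stay neutral about which side counts as "scaled".

```cpp
// Hypothetical wrapper types; they differ only in what the integer means.
struct WholeUnits { int value; };  // plain integer units
struct FixedUnits { int value; };  // the same quantity multiplied by 256

constexpr int kScaleBits = 8;

// Whole units -> fixed-point units (multiply by 2^8).
constexpr FixedUnits to_fixed(WholeUnits w) {
    return FixedUnits{w.value * (1 << kScaleBits)};
}

// Fixed-point units -> whole units (drop the 8 fraction bits).
constexpr WholeUnits to_whole(FixedUnits f) {
    return WholeUnits{f.value >> kScaleBits};
}

// to_whole(to_whole(x)) does not compile, because to_whole returns
// WholeUnits and only accepts FixedUnits.
```

The trade-off is the one already described: this relies on the compiler inlining the wrappers away, whereas the macro convention costs nothing but relies on discipline.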

paxdiablo · Answer

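I wouldn't use floating point at all on a CPU without special hardware for handling it. My advice is to treat ALL numbers as integers scaled by a specific factor. For example, all monetary values are stored in cents as integers rather than in dollars as floats, so 0.72 is represented as the integer 72.

Addition and subtraction are then very simple integer operations (0.72 + 1 becomes 72 + 100, which gives 172, which represents 1.72).

Multiplication is slightly more complex, as it needs an integer multiply followed by a scale-back (0.72 * 2 becomes 72 * 200, which gives 14400, which scales back to 144, which represents 1.44).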
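The arithmetic above can be written down directly. This is a minimal sketch assuming a scale factor of 100; the names kScale, add_scaled and mul_scaled are made up for illustration.

```cpp
#include <cstdint>

// Every value is stored as an integer number of hundredths.
constexpr std::int64_t kScale = 100;

// Addition needs no adjustment: both operands already share the same scale.
constexpr std::int64_t add_scaled(std::int64_t a, std::int64_t b) {
    return a + b;
}

// The raw product carries the scale twice (100 * 100), so divide once by the
// scale to get back to hundredths. Integer division truncates toward zero.
constexpr std::int64_t mul_scaled(std::int64_t a, std::int64_t b) {
    return (a * b) / kScale;
}
```

Here add_scaled(72, 100) gives 172 (1.72) and mul_scaled(72, 200) gives 144 (1.44), matching the worked numbers above.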

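You may need special functions for more complex math (sine, cosine, etc.), but even those can be sped up with lookup tables. Example: since you're using a two-decimal-place representation, there are only 100 values in the range [0.0, 1.0) (0-99), and if the angle is expressed so that sin/cos repeat outside this range (for instance, as a fraction of a full turn), you only need a 100-integer lookup table.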
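Here is a sketch of such a table, under the assumption that angles are stored as hundredths of a full turn (so 25 means a quarter turn) and results as hundredths. The names build_sin_table and sin_scaled are hypothetical. The table is built with std::sin for brevity; on hardware without an FPU you would more likely precompute it offline and store it as constant data.

```cpp
#include <array>
#include <cmath>

constexpr int kScale = 100; // two decimal places for both angle and result

// Entry i holds sin(2*pi * i/100), scaled by 100 and rounded to nearest.
inline std::array<int, kScale> build_sin_table() {
    const double two_pi = 6.283185307179586;
    std::array<int, kScale> table{};
    for (int i = 0; i < kScale; ++i) {
        table[i] = static_cast<int>(
            std::lround(kScale * std::sin(two_pi * i / kScale)));
    }
    return table;
}

// angle_scaled: the angle in turns, times 100. Values outside 0..99 wrap
// around, because sine repeats after every full turn.
inline int sin_scaled(int angle_scaled) {
    static const std::array<int, kScale> table = build_sin_table();
    int index = angle_scaled % kScale;
    if (index < 0) {
        index += kScale; // % can yield a negative result for negative input
    }
    return table[index];
}
```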

Cheers, Pax.

ryan_s · Answer

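When I first encountered fixed-point numbers, I found Joe Lemieux's article, Fixed-point Math in C, very helpful. It suggests one way of representing fixed-point values.

I didn't wind up using his union representation for fixed-point numbers, though. My experience with fixed-point is mostly in C, so I haven't had the option of using a class either. For the most part, though, I think that defining your number of fraction bits in a macro and using descriptive variable names makes this fairly easy to work with. I've also found that it is best to have macros or functions for multiplication and especially division, or the code quickly becomes unreadable.

For example, with 24.8 values: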

    #include <stdio.h>

    /* Declarations for fixed point stuff */

    typedef int int_fixed;

    #define FRACT_BITS 8
    #define FIXED_POINT_ONE (1 << FRACT_BITS)
    #define MAKE_INT_FIXED(x) ((x) << FRACT_BITS)
    #define MAKE_FLOAT_FIXED(x) ((int_fixed)((x) * FIXED_POINT_ONE))
    #define MAKE_FIXED_INT(x) ((x) >> FRACT_BITS)
    #define MAKE_FIXED_FLOAT(x) (((float)(x)) / FIXED_POINT_ONE)

    #define FIXED_MULT(x, y) ((x)*(y) >> FRACT_BITS)
    #define FIXED_DIV(x, y) (((x)<<FRACT_BITS) / (y))

    /* tests */
    int main()
    {
        int_fixed fixed_x = MAKE_FLOAT_FIXED( 4.5f );
        int_fixed fixed_y = MAKE_INT_FIXED( 2 );

        int_fixed fixed_result = FIXED_MULT( fixed_x, fixed_y );
        printf( "%.1f\n", MAKE_FIXED_FLOAT( fixed_result ) );

        fixed_result = FIXED_DIV( fixed_result, fixed_y );
        printf( "%.1f\n", MAKE_FIXED_FLOAT( fixed_result ) );

        return 0;
    }

Which writes out

    9.0
    4.5

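Note that there are all kinds of integer overflow issues with those macros; I just wanted to keep the macros simple. This is only a quick and dirty example of how I've done this in C. In C++ you could make something a lot cleaner using operator overloading. Actually, you could easily make that C code a lot prettier too...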
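One common way to reduce the overflow risk in the multiply and divide steps is to widen to 64 bits for the intermediate result. The sketch below is an assumption about how that might look for the same 24.8 format, written as inline C++ functions rather than macros; fixed_mult and fixed_div are illustrative names, and the final narrowing can still overflow if the true result does not fit in 24.8.

```cpp
#include <cassert>
#include <cstdint>

typedef std::int32_t int_fixed;  // 24.8 fixed-point value
constexpr int kFractBits = 8;

// Widen before multiplying so the intermediate product has 64 bits to work
// with, then shift back down to 8 fraction bits.
inline int_fixed fixed_mult(int_fixed x, int_fixed y) {
    const std::int64_t product = static_cast<std::int64_t>(x) * y;
    return static_cast<int_fixed>(product >> kFractBits);
}

// Widen before pre-scaling the numerator so the scaling step cannot overflow.
// Multiplying by 2^8 instead of shifting avoids left-shifting negative values.
inline int_fixed fixed_div(int_fixed x, int_fixed y) {
    assert(y != 0 && "division by zero");
    const std::int64_t scaled = static_cast<std::int64_t>(x) * (1 << kFractBits);
    return static_cast<int_fixed>(scaled / y);
}
```

For instance, 1000.0 * 1000.0 has a raw 32-bit product that overflows in the macro version, while fixed_mult(256000, 256000) returns 256000000, the 24.8 encoding of 1000000.0.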

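I guess this is a long-winded way of saying: I think it's fine to use a typedef and macro approach. As long as you're clear about which variables contain fixed-point values, it isn't too hard to maintain, but it probably won't be as pretty as a C++ class.

If I were in your position, I would try to get some profiling numbers to show where the bottlenecks are. If there are relatively few of them, go with a typedef and macros. If you decide that you need a global replacement of all floats with fixed-point math, though, you'll probably be better off with a class.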

Ana Betts · Answer

The original version of Tricks of the Game Programming Gurus has an entire chapter on implementing fixed-point math.

jfm3 · Answer

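Whichever way you decide to go (I'd lean toward a typedef and some CPP macros for converting), you will need to be careful to convert back and forth with some discipline.

You might find that you never need to convert back and forth. Just imagine that everything in the whole system is x256.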
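To illustrate the "everything is x256" idea, here is a small sketch with a hypothetical Ball struct: all state lives in 1/256 units, updates never convert, and the only conversion happens at the boundary where a whole number is genuinely required (here, an assumed pixel coordinate for drawing).

```cpp
#include <cstdint>

// Hypothetical game-object state; every field is in units of 1/256 pixel.
struct Ball {
    std::int32_t x;   // position times 256
    std::int32_t vx;  // velocity per frame times 256
};

// The update stays entirely in x256 units, so no conversion is needed.
inline void step(Ball& ball) {
    ball.x += ball.vx;
}

// The only conversion: a whole pixel coordinate for the drawing code.
// Integer division truncates toward zero.
inline int pixel_x(const Ball& ball) {
    return ball.x / 256;
}
```

A ball with vx = 384 (1.5 pixels per frame) moves three whole pixels after two calls to step, without any float ever being involved.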

cibyr · Answer

    template <int precision = 8>
    class FixedPoint {
    private:
        int val_;
    public:
        inline FixedPoint(int val) : val_ (val << precision) {}
        inline operator int() { return val_ >> precision; }
        // Other operators...
    };