Find nearest value in numpy array

Question

Is there a numpy-thonic way, e.g. a function, to find the nearest value in an array?

Example:

np.find_nearest( array, value )

unutbu · Accepted Answer

import numpy as np

def find_nearest(array, value):
    array = np.asarray(array)
    idx = (np.abs(array - value)).argmin()
    return array[idx]

array = np.random.random(10)
print(array)
# [ 0.21069679  0.61290182  0.63425412  0.84635244  0.91599191  0.00213826
#   0.17104965  0.56874386  0.57319379  0.28719469]

value = 0.5
print(find_nearest(array, value))
# 0.568743859261

Demitri · Answer

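IF your array is sorted and very large, this is a much faster solution:

import math
import numpy as np

def find_nearest(array, value):
    # array must be sorted in ascending order
    idx = np.searchsorted(array, value, side="left")
    if idx > 0 and (idx == len(array) or math.fabs(value - array[idx-1]) < math.fabs(value - array[idx])):
        return array[idx-1]
    else:
        return array[idx]

This scales to very large arrays. If you cannot assume that the array is already sorted, you can easily modify the above to sort inside the method. It is overkill for small arrays, but once they get large it is much faster.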

kwgoodman · Answer

With a slight modification, the answer above works with arrays of arbitrary dimension (1d, 2d, 3d, ...):

def find_nearest(a, a0):
    "Element in nd array `a` closest to the scalar value `a0`"
    idx = np.abs(a - a0).argmin()
    return a.flat[idx]

Or, written as a single line:

a.flat[np.abs(a - a0).argmin()]

Josh Albert · Answer

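Summary of answer: if you have a sorted array, the bisection code (given below) performs the fastest: roughly 100-1000 times faster for large arrays and roughly 2-100 times faster for small arrays. It does not require numpy either. If you have an unsorted array and it is large, consider first using an O(n log n) sort and then bisection; if the array is small, method 2 seems to be the fastest.

First you should clarify what you mean by nearest value. Often one wants the interval on an abscissa, e.g. for array = [0, 0.7, 2.1] and value = 1.95 the answer would be idx = 1. This is the case I suspect you need (otherwise the code below can be modified very easily with a follow-up conditional statement once you have found the interval). I will note that the optimal way to do this is bisection, which I provide first. It does not require numpy at all and is faster than using numpy functions, because those perform redundant operations. After that I provide a timing comparison against the other methods presented here by other users.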
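
To make the difference between the interval index and the nearest index concrete, here is a minimal sketch using the example values above. It assumes a sorted array and uses np.searchsorted for the interval; the names interval_idx and nearest_idx are only illustrative.

```python
import numpy as np

array = np.array([0.0, 0.7, 2.1])
value = 1.95

# Interval index: largest j with array[j] <= value (array must be sorted).
interval_idx = int(np.searchsorted(array, value, side="right")) - 1

# Nearest index: position of the element with the smallest absolute difference.
nearest_idx = int(np.abs(array - value).argmin())

print(interval_idx)  # 1, because 0.7 <= 1.95 < 2.1
print(nearest_idx)   # 2, because 2.1 is only 0.15 away
```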

Bisection:

def bisection(array, value):
    '''Given an ``array`` , and given a ``value`` , returns an index j such that
    ``value`` is between array[j] and array[j+1]. ``array`` must be monotonic
    increasing. j=-1 or j=len(array) is returned to indicate that ``value`` is
    out of range below and above respectively.'''
    n = len(array)
    if (value < array[0]):
        return -1
    elif (value > array[n-1]):
        return n
    jl = 0      # Initialize lower
    ju = n-1    # and upper limits.
    while (ju-jl > 1):        # If we are not yet done,
        jm = (ju+jl) >> 1     # compute a midpoint with a bitshift
        if (value >= array[jm]):
            jl = jm           # and replace either the lower limit
        else:
            ju = jm           # or the upper limit, as appropriate.
        # Repeat until the test condition is satisfied.
    if (value == array[0]):       # Edge cases at bottom
        return 0
    elif (value == array[n-1]):   # and top
        return n-1
    else:
        return jl
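
The follow-up conditional mentioned earlier, which turns an interval index into a nearest-element index, might look like the sketch below. Note that it swaps in the standard-library bisect module instead of the hand-written loop, and that nearest_index_sorted is a hypothetical helper name, not part of the original answer. Out-of-range values are clamped to the first or last index, and ties go to the lower index.

```python
import bisect

def nearest_index_sorted(array, value):
    """Index of the element of the sorted sequence `array` closest to `value`."""
    n = len(array)
    if n == 0:
        raise ValueError("array must not be empty")
    # j is the interval index: array[j] <= value < array[j + 1]
    j = bisect.bisect_right(array, value) - 1
    if j < 0:          # value lies below the first element
        return 0
    if j >= n - 1:     # value lies at or above the last element
        return n - 1
    # Follow-up conditional: pick the closer end of the interval.
    if value - array[j] <= array[j + 1] - value:
        return j
    return j + 1

print(nearest_index_sorted([0, 0.7, 2.1], 1.95))  # 2
print(nearest_index_sorted([0, 0.7, 2.1], 0.3))   # 0
```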

Next I define the code from the other answers. Each function returns an index:

import math
import numpy as np

def find_nearest1(array, value):
    idx, val = min(enumerate(array), key=lambda x: abs(x[1]-value))
    return idx

def find_nearest2(array, values):
    indices = np.abs(np.subtract.outer(array, values)).argmin(0)
    return indices

def find_nearest3(array, values):
    values = np.atleast_1d(values)
    indices = np.abs(np.int64(np.subtract.outer(array, values))).argmin(0)
    out = array[indices]
    return indices

def find_nearest4(array, value):
    idx = (np.abs(array-value)).argmin()
    return idx

def find_nearest5(array, value):
    idx_sorted = np.argsort(array)
    sorted_array = np.array(array[idx_sorted])
    idx = np.searchsorted(sorted_array, value, side="left")
    if idx >= len(array):
        idx_nearest = idx_sorted[len(array)-1]
    elif idx == 0:
        idx_nearest = idx_sorted[0]
    else:
        if abs(value - sorted_array[idx-1]) < abs(value - sorted_array[idx]):
            idx_nearest = idx_sorted[idx-1]
        else:
            idx_nearest = idx_sorted[idx]
    return idx_nearest

def find_nearest6(array, value):
    xi = np.argmin(np.abs(np.ceil(array[None].T - value)), axis=0)
    return xi

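Now I will time the codes. Note that methods 1, 2, 4 and 5 do not correctly give the interval. Methods 1, 2 and 4 round to the nearest point in the array (e.g. >=1.5 -> 2), and method 5 always rounds up (e.g. 1.45 -> 2). Only methods 3 and 6, and of course bisection, give the interval properly.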

array = np.arange(100000)
val = array[50000]+0.55
print( bisection(array,val))
%timeit bisection(array,val)
print( find_nearest1(array,val))
%timeit find_nearest1(array,val)
print( find_nearest2(array,val))
%timeit find_nearest2(array,val)
print( find_nearest3(array,val))
%timeit find_nearest3(array,val)
print( find_nearest4(array,val))
%timeit find_nearest4(array,val)
print( find_nearest5(array,val))
%timeit find_nearest5(array,val)
print( find_nearest6(array,val))
%timeit find_nearest6(array,val)

(50000, 50000)
100000 loops, best of 3: 4.4 µs per loop
50001
1 loop, best of 3: 180 ms per loop
50001
1000 loops, best of 3: 267 µs per loop
[50000]
1000 loops, best of 3: 390 µs per loop
50001
1000 loops, best of 3: 259 µs per loop
50001
1000 loops, best of 3: 1.21 ms per loop
[50000]
1000 loops, best of 3: 746 µs per loop

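For a large array, bisection takes about 4 µs, compared with about 260 µs for the next-best method and 1.21 ms for the slowest numpy-based one (method 1, a pure-Python loop, takes 180 ms), so it is on the order of 100 to 1000 times faster. For smaller arrays it is roughly 2-100 times faster.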

Onasafari · Answer

Here is an extension to find the nearest vector in an array of vectors.

import numpy as np

def find_nearest_vector(array, value):
    # Euclidean distance from every row of `array` to `value`
    idx = np.linalg.norm(array - value, axis=1).argmin()
    return array[idx]

A = np.random.random((10,2))*100
# Example value of A:
# array([[ 34.19762933,  43.14534123],
#        [ 48.79558706,  47.79243283],
#        [ 38.42774411,  84.87155478],
#        [ 63.64371943,  50.7722317 ],
#        [ 73.56362857,  27.87895698],
#        [ 96.67790593,  77.76150486],
#        [ 68.86202147,  21.38735169],
#        [  5.21796467,  59.17051276],
#        [ 82.92389467,  99.90387851],
#        [  6.76626539,  30.50661753]])

pt = [6, 30]
print(find_nearest_vector(A, pt))
# [ 6.76626539 30.50661753]
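
If there are several query points, the same idea can be applied to all of them at once with broadcasting. The following is only a sketch: find_nearest_vectors is a hypothetical name, and it builds an (m, n, d) intermediate array, so it assumes the inputs are small enough for that to fit in memory.

```python
import numpy as np

def find_nearest_vectors(array, points):
    """For each row of `points`, return the closest row of `array`."""
    array = np.asarray(array, dtype=float)                     # shape (n, d)
    points = np.atleast_2d(np.asarray(points, dtype=float))    # shape (m, d)
    # Pairwise differences via broadcasting: shape (m, n, d)
    diffs = points[:, None, :] - array[None, :, :]
    # Euclidean distance from every point to every row: shape (m, n)
    dists = np.linalg.norm(diffs, axis=2)
    return array[dists.argmin(axis=1)]

A = np.array([[0.0, 0.0], [10.0, 10.0], [6.5, 30.5]])
result = find_nearest_vectors(A, [[6, 30], [9, 9]])
print(result)  # rows [6.5, 30.5] and [10, 10]
```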

ryggyr · Answer

Here is a version that handles a non-scalar "values" array:

import numpy as np

def find_nearest(array, values):
    indices = np.abs(np.subtract.outer(array, values)).argmin(0)
    return array[indices]

Or a version that returns a numeric type (e.g. int, float) if the input is scalar:

def find_nearest(array, values):
    values = np.atleast_1d(values)
    indices = np.abs(np.subtract.outer(array, values)).argmin(0)
    out = array[indices]
    return out if len(out) > 1 else out[0]

Nick Crawford · Answer

If you don't want to use numpy, this will do it:

def find_nearest(array, value):
    n = [abs(i-value) for i in array]
    idx = n.index(min(n))
    return array[idx]
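
A slightly shorter pure-Python variant, sketched here as an alternative rather than as the answer's own method, lets min() do the comparison through a key function, so no intermediate list of distances is built. On ties it returns the first closest element.

```python
def find_nearest(array, value):
    # min() with a key function compares elements by their distance to `value`
    return min(array, key=lambda x: abs(x - value))

print(find_nearest([1, 4, 9, 16], 7))  # 9
```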

efirvida · Answer

Here is a version using scipy of @Ari Onasafari's answer, "to find the nearest vector in an array of vectors":

In [1]: from scipy import spatial

In [2]: import numpy as np

In [3]: A = np.random.random((10,2))*100

In [4]: A
Out[4]:
array([[ 68.83402637,  38.07632221],
       [ 76.84704074,  24.9395109 ],
       [ 16.26715795,  98.52763827],
       [ 70.99411985,  67.31740151],
       [ 71.72452181,  24.13516764],
       [ 17.22707611,  20.65425362],
       [ 43.85122458,  21.50624882],
       [ 76.71987125,  44.95031274],
       [ 63.77341073,  78.87417774],
       [  8.45828909,  30.18426696]])

In [5]: pt = [6, 30]  # <-- the point to find

In [6]: A[spatial.KDTree(A).query(pt)[1]]  # <-- the nearest point
Out[6]: array([  8.45828909,  30.18426696])

#how it works!
In [7]: distance,index = spatial.KDTree(A).query(pt)

In [8]: distance  # <-- The distances to the nearest neighbors
Out[8]: 2.4651855048258393

In [9]: index  # <-- The locations of the neighbors
Out[9]: 9

#then
In [10]: A[index]
Out[10]: array([  8.45828909,  30.18426696])

aph · Answer

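For large arrays, the (excellent) answer given by @Demitri is far faster than the answer currently marked as best. I have adapted his exact algorithm in the following two ways:

The function below works whether or not the input array is sorted.
The function below returns the index of the input array corresponding to the closest value, which is somewhat more general.

Note that the function below also handles a specific edge case that would lead to a bug in the original function written by @Demitri. Otherwise, my algorithm is identical to his.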

def find_idx_nearest_val(array, value):
    idx_sorted = np.argsort(array)
    sorted_array = np.array(array[idx_sorted])
    idx = np.searchsorted(sorted_array, value, side="left")
    if idx >= len(array):
        idx_nearest = idx_sorted[len(array)-1]
    elif idx == 0:
        idx_nearest = idx_sorted[0]
    else:
        if abs(value - sorted_array[idx-1]) < abs(value - sorted_array[idx]):
            idx_nearest = idx_sorted[idx-1]
        else:
            idx_nearest = idx_sorted[idx]
    return idx_nearest

anthonybell · Answer

Here is a fast vectorized version of @Demitri's solution for when you have many values to search for (values can be a multi-dimensional array):

import numpy as np

# `array` must be sorted in ascending order
def get_closest(array, values):
    # make sure array is a numpy array
    array = np.array(array)

    # get insert positions
    idxs = np.searchsorted(array, values, side="left")

    # find indexes where previous index is closer
    prev_idx_is_less = ((idxs == len(array)) | (np.fabs(values - array[np.maximum(idxs-1, 0)]) < np.fabs(values - array[np.minimum(idxs, len(array)-1)])))
    idxs[prev_idx_is_less] -= 1

    return array[idxs]

Benchmarks

More than 100 times faster than using a for loop with @Demitri's solution:

>>> %timeit ar=get_closest(np.linspace(1, 1000, 100), np.random.randint(0, 1050, (1000, 1000)))
139 ms ± 4.04 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

>>> %timeit ar=[find_nearest(np.linspace(1, 1000, 100), value) for value in np.random.randint(0, 1050, 1000*1000)]
took 21.4 seconds
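
If the positions are needed rather than the values, the same vectorized approach can return indices. This is a sketch under a few assumptions: get_closest_indices is a hypothetical name, `array` is one-dimensional, sorted in ascending order and has at least two elements, and ties go to the higher index (matching the strict comparison used above).

```python
import numpy as np

def get_closest_indices(array, values):
    """Indices into the sorted 1-D `array` of the elements closest to `values`."""
    array = np.asarray(array)
    values = np.asarray(values)
    # Insert positions, clipped so that both idx - 1 and idx are valid indices
    idx = np.clip(np.searchsorted(array, values, side="left"), 1, len(array) - 1)
    left = array[idx - 1]
    right = array[idx]
    # Step back by one where the left neighbour is strictly closer
    closer_left = np.abs(values - left) < np.abs(values - right)
    return idx - closer_left.astype(idx.dtype)

array = np.array([1, 3, 7])
values = np.array([0, 2.9, 5, 5.1, 10])
print(get_closest_indices(array, values))  # [0 1 2 2 2]
```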

Zhanwen Chen · Answer

Here is a vectorized version of unutbu's answer.

import numpy as np
import matplotlib.pyplot as plt

def find_nearest(array, values):
    array = np.asarray(array)
    # the last dim must be 1 to broadcast in (array - values) below;
    # cast to float so unsigned integer inputs (e.g. uint8 images) cannot wrap around.
    values = np.expand_dims(values, axis=-1).astype(float)
    indices = np.abs(array - values).argmin(axis=-1)
    return array[indices]

image = plt.imread('example_3_band_image.jpg')
print(image.shape)  # should be (nrows, ncols, 3)

quantiles = np.linspace(0, 255, num=2 ** 2, dtype=np.uint8)
quantiled_image = find_nearest(quantiles, image)
print(quantiled_image.shape)  # should be (nrows, ncols, 3)
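
One pitfall worth checking when both the candidate levels and the data are unsigned integers, as with uint8 image data: subtraction wraps around instead of going negative, so np.abs of the difference no longer measures distance. A minimal sketch with made-up values (the names levels and pixel are illustrative):

```python
import numpy as np

levels = np.array([0, 85, 170, 255], dtype=np.uint8)
pixel = np.array([100], dtype=np.uint8)

# uint8 - uint8 stays uint8, so 0 - 100 becomes 156 and 85 - 100 becomes 241.
wrapped = np.abs(levels - pixel)
print(wrapped)            # [156 241  70 155]
print(wrapped.argmin())   # 2, i.e. level 170, which is not the nearest one

# Casting to a signed or floating type first gives the intended distances.
safe = np.abs(levels.astype(float) - pixel.astype(float))
print(safe.argmin())      # 1, i.e. level 85
```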

Soumen · Answer

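All of the answers are useful for gathering the information needed to write efficient code. However, I have written a small Python script to optimize for different cases. The best case is when the provided array is sorted. If you are searching for the index of the point nearest to a single specified value, the bisect module is the most time-efficient. If you are searching for the indices corresponding to a whole array of values, numpy searchsorted is the most efficient.

import numpy as np
import bisect

xarr = np.random.rand(int(1e7))
srt_ind = xarr.argsort()
xar = xarr.copy()[srt_ind]
xlist = xar.tolist()
bisect.bisect_left(xlist, 0.3)

In [63]: %time bisect.bisect_left(xlist, 0.3)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 22.2 µs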

np.searchsorted(xar, 0.3, side="left")

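In [64]: %time np.searchsorted(xar, 0.3, side="left")
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 98.9 µs

randpts = np.random.rand(1000)
np.searchsorted(xar, randpts, side="left")

%time np.searchsorted(xar, randpts, side="left")
CPU times: user 4 ms, sys: 0 ns, total: 4 ms
Wall time: 1.2 ms

By the multiplicative rule, 1000 individual lookups at about 100 µs each should take numpy about 100 ms, whereas the single vectorized call takes 1.2 ms, which is about 83 times faster.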
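
To reproduce this kind of comparison outside IPython, a rough sketch such as the following could be used. It assumes a smaller array (1e6 elements instead of 1e7) to keep the run short and uses time.perf_counter instead of %time; absolute timings will differ from machine to machine, so no expected numbers are shown.

```python
import time
import numpy as np

rng = np.random.default_rng(0)
xar = np.sort(rng.random(1_000_000))   # sorted haystack
randpts = rng.random(1000)             # values to look up

t0 = time.perf_counter()
# One searchsorted call per value
looped = np.array([np.searchsorted(xar, p, side="left") for p in randpts])
t1 = time.perf_counter()
# A single vectorized call for all values
vectorized = np.searchsorted(xar, randpts, side="left")
t2 = time.perf_counter()

assert np.array_equal(looped, vectorized)   # same indices either way
print(f"1000 scalar calls:   {(t1 - t0) * 1e3:.2f} ms")
print(f"one vectorized call: {(t2 - t1) * 1e3:.2f} ms")
```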

Ishan Tomar · Answer

I think the most pythonic way would be:

import numpy as np

num = 65  # Input number
array = np.random.random((10))*100  # Given array
nearest_idx = np.where(abs(array-num)==abs(array-num).min())[0]  # If you want the index of the element of array (array) nearest to the given number (num)
nearest_val = array[abs(array-num)==abs(array-num).min()]  # If you directly want the element of array (array) nearest to the given number (num)

This is the basic code. You can wrap it in a function if you want.

Gusev Slava · Answer

Probably helpful for ndarrays:

def find_nearest(X, value):
    return X[np.unravel_index(np.argmin(np.abs(X - value)), X.shape)]

Eduardo S. Pereira · Answer

For a 2D array, to determine the i, j position of the nearest element:

import numpy as np

def find_nearest(a, a0):
    idx = (np.abs(a - a0)).argmin()
    w = a.shape[1]
    i = idx // w
    j = idx - i * w
    return a[i,j], i, j
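
The row and column arithmetic can also be delegated to np.unravel_index, which works for any number of dimensions. A minimal sketch, with find_nearest_index as a hypothetical name and sample values chosen only for illustration:

```python
import numpy as np

def find_nearest_index(a, a0):
    """Return (value, index tuple) of the element of `a` closest to `a0`."""
    a = np.asarray(a)
    flat_idx = np.abs(a - a0).argmin()           # index into the flattened array
    idx = np.unravel_index(flat_idx, a.shape)    # e.g. (i, j) for a 2-D array
    return a[idx], idx

a = np.array([[60, 200, 30], [3, 30, 50], [20, 1, -50], [20, -500, 11]])
value, (i, j) = find_nearest_index(a, 0)
print(value, i, j)  # 1 2 1
```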

kareem mohamed · Answer

import numpy as np

def find_nearest(array, value):
    array = np.array(array)
    z = np.abs(array-value)
    y = np.where(z == z.min())
    m = np.array(y)
    x = m[0,0]
    y = m[1,0]
    near_value = array[x,y]
    return near_value

array = np.array([[60,200,30],[3,30,50],[20,1,-50],[20,-500,11]])
print(array)

value = 0
print(find_nearest(array, value))