行政院國家科學委員會專題研究計畫 成果報告
以有限張數的場景影像為依據作為覆蓋建築物內全域動線 的視覺導航系統
研究成果報告(精簡版)
計 畫 類 別 : 個別型
計 畫 編 號 : NSC 99-2221-E-011-134-
執 行 期 間 : 99 年 08 月 01 日至 100 年 07 月 31 日 執 行 單 位 : 國立臺灣科技大學電機工程系
計 畫 主 持 人 : 鍾聖倫 共 同 主 持 人 : 項天瑞
計畫參與人員: 碩士班研究生-兼任助理人員:廖士辰 博士班研究生-兼任助理人員:傅宇
報 告 附 件 : 出席國際會議研究心得報告及發表論文
處 理 方 式 : 本計畫可公開查詢
中 華 民 國 100 年 10 月 20 日
行政院國家科學委員會補助專題研究計畫 █成果報告
□期中進度報告
以有限張數的場景影像為依據作為覆蓋建築物內全域動線 的視覺導航系統
計畫類別:█ 個別型計畫 □整合型計畫
計畫編號:NSC 99 - 2221 - E - 100 - 134 - 執行期間:99 年 08 月 01 日至 100 年 07 月 31 日
執行機構及系所:國立臺灣科技大學電機工程學系†.資訊工程學系‡
計畫主持人:鍾聖倫† 教授 共同主持人:項天瑞‡ 助理教授 計畫參與人員:傅宇、廖士辰
成果報告類型(依經費核定清單規定繳交):█精簡報告 □完整報告
本計畫除繳交成果報告外,另須繳交以下出國心得報告:
□赴國外出差或研習心得報告
□赴大陸地區出差或研習心得報告
█出席國際學術會議心得報告
□國際合作研究計畫國外研究報告
處理方式:除列管計畫及下列情形者外,得立即公開查詢
█涉及專利或其他智慧財產權,□一年□二年後可公開查詢 中 華 民 國 100 年 09 月 15 日
附件一
行政院國家科學委員會專題研究計畫成果報告
以有限張數的場景影像為依據作為覆蓋建築物內 全域動線的視覺導航系統
國科會計畫編號:NSC 99-2221-E-011-134
執行期間: 民國九十九年八月一日至一百年十月卅一日 執行機構: 國立臺灣科技大學電機工程學系†.資訊工程學系‡
主持人:鍾聖倫† 教授 共同主持人: 項天瑞‡ 助理教授
摘要—本研究計畫針對自主機器人提出一個基於影像序 列的視覺導航系統。系統先取得表示著導航路線的影像序列。當 導航時,兩種導航模式根據著機器人當下影像以及影像序列中最 接近參考影像之間尺度差異來做切換,以減少導航時間。當尺度 差異很大時,log-polar 轉換可以用來找到兩張影像之間的對應,
此時機器人以一種較快但較不精準的導航模式來移動直到接近 目標航點。當機器人由於往航點移動而漸漸將當下影像和參考影 像的尺度差異減少後,一種較精確但慢速的導航模式將啟用。相 較於其他研究以較慢(10 公分/秒)的速度下導航,本論文提出的方 法可以在較快的平均速度(26.5 公分/秒)下導航而不失其導航精準 度,此外也減少了表示一個路線所需要的參考影像張數。
關鍵字-視覺導航;SIFT;log-polar影像轉換 I. 介紹
為了使機器人在工作環境下能夠自主的移 動,機器人需要以幾何或拓撲的方式去建立環境 地圖模型,不同的導航技術幾乎都是根據這兩種 地圖所發展的。
本篇論文中主要研究拓撲式的導航技術,特 別是基於影像序列的視覺導航技術[1]-[6]。這項 導航技術的靈感源自於人類與動物,在沒有真正 公制化的地圖資訊下能夠經由儲存所看到的影像 序列來當作導航的資訊進行導航的動作。一開始 機器人需要被指引一條路線來建立出工作環境的 影像序列,建立完成後,便可以在相同的路線下 做自主導航。在本篇論文中,在影像序列裡的每 一張影像都稱為參考影像而參考影像的位置點被 稱為航點。
A. 相關工作
在過去已經有很多基於影像序列的導航技術 已被提出。Courbon 等人[1]及 Lopez-Nicolas 等人
[2] 設 計 了 在 有 限 視 野 (Field Of View) 與 non-holonmoic 的限制下兩個連續航點的導航方 法。Fu 等人[3] 用 epipolar 幾何提出了一個多航 點的導航技術用來導引機器人從某一航點到一個 不相鄰的航點。Ido 等人[4]修改了對於人型機器 人的影像序列導航方法,特別是針對移動時物體 所產生的模糊影像做偵測及過濾處理。Chen 和 Birchfield[5]從參考影像上追蹤角點,因此在建立 影像序列時只要有當被追蹤角點數減少至原本的 一半時才需抓取一個新的參考影像。除了影像序 列外,從里程計估計的相對位置也可以用來協助 連續航點間的導航。一個更精確的導航角可以由 當時影像與參考影像間的差異以及里程計共同推 論 出 來 。 Lopez-Nicolas 等 人 [6] 設 計 一 個 用 epipolar 幾何與 homography 從一航點到鄰近的航 點的導航方法,透過切換估算計算導航角的來 源,epipolar 幾何或 homography,可以避免當特 徵共平面等情況所發生 epipolar 幾何退化的情 形,因此此方法能在較不嚴格限制的環境中應用。
上述方法僅有少數能應用在真實高速行走下 的機器人上。Ido 等人[4]以及 Chen 和 Birchfield[5]
的導航方法下,機器人的速度上限為 10 公分/
秒,Lopez-Nicolas 等人[6]則在 4 公分/秒的最高 速度下成功地導航。
B. 提出的導航方法摘要
在本論文中,為了減少導航時間與參考影像 的數量,我們在之前的工作[3]上加入了 log-polar 轉換[7][8]的影像比對方法來找到較大尺度差異
的兩張影像間的對應區域,特別是針對當 SIFT 特徵無法被正確的對應時。當機器人往航點導航 時,導航的行為被分為兩種模式:一種為能快速移 動機器人但犧牲導航準確度且採用 log-polar 轉換 的導航模式,另一種為採用 SIFT 特徵比對較精 準但緩慢的導航模式。本論文提出的導航方法在 兩輪的機器人上裝載著朝向前端的一般攝影機做 了測試。實驗結果顯示機器人在高速移動下可以 減少過去工作的一半導航時間,而其精準度與之 前的工作[3]是一樣的。
C. 貢獻
本論文的貢獻如下:
本論文提出的導航系統能夠在速度為 26.5 公分/秒
的速地下移動,比其他在室內環境下的導航研究都 來的快速,因此可協助機器人能用更短的時間達到 目的地。
比較使用 log-polar 轉換的影像對應法與使用 SIFT
特徵的對應法,相同路徑下所需的參考影像數量可 以減少。
D. 論文架構
本論文其餘的部分架構如下:第二章為技術 性問題的定義;第三章說明使用 log-polar 轉換的 影像對應法並詳述導航系統以及建立參考影像的 步驟;第四章提供實驗結果來說明提出導航系統 的好壞;第五章為結論。
II. 問題定義
本論文提出之基於影像序列的導航系統主要 目的為減少相同導航路線下參考影像的數量以及 導航時間。大部份影像序列的導航系統中常常被 研究的是用於兩連續航點間導航的視覺歸位。
Mariottini 等人[10]由目前影像特徵位置與在參考 圖像上對應的特徵位置距離差的大小平方來決定 機器人移動速度。Lopez Nicolas 等人[11]設計了 多個階段的視覺歸位演算法,一開始機器人先快 速的移動至航點的後方,接著用緩慢且恆定的速 度往航點移動。Fu 等人[3]使用一個比例控制器根 據對應特徵位置垂直方向差異的平均來決定移動 速度。當上述任一視覺歸位的演算法被使用在多 航點的路徑上時,由於靠近每一個航點時都要減 速,因此相同長度的路徑下,越多的航點表示著 導航時間越長。
航點可以根據其功能分成服務航點,交界航
點以及中途航點。服務航點代表一個機器人提供 服務的重要位置,例如一個運輸機器人提取或卸 載物品的地點,以及在博物館的導引機器人的資 訊服務處。交界航點像是一個交叉路口或 T 字路 口上兩直線路線的交界處。中途航點則是用在上 述兩種航點間,這兩航點間距離過長而無法使用 一次的視覺歸位來導航時,便需要在其間適量地 加入中途航點。使用不同的視覺歸位演算法會使 得中途航點的數量以及位置有所改變。圖 1 舉了 一個在建築物裡內航點位置的例子。
透過設計一個能使機器人從較遠離目的航點 的視覺歸位演算法,可以減少所需的相同長度的 路徑上所需中途航點的數量,進而加快導航速 度。對於視覺歸位演算法[3][12]而言,相鄰航點 間必須確保著一個最少的對應特徵數量,才能從 特徵對應中計算出 epipolar 幾何與 homography 進 而推算導航角。在這個最少的對應特徵數量限制 下,機器人僅有在目標航點後方的一個有限範圍 區域內才可以保證採用視覺歸位演算法來導航制 目標航點。圖 2 是一對尺度差異過大的影像。在 這個例子中由於過大的尺度差異,即便採用 SIFT 特 徵 也 無 法 找 到 足 夠 對 應 的 特 徵 點 來 計 算 epipolar 幾何。因此在這個例子中,至少還需要 一個中途航點才可以順利地採用視覺歸位演算法 來導航。
圖 1 大樓群內的航點位置
圖 2 尺度差異過大下 SIFT 特徵比對的結果
由於大多數基於影像特徵的視覺歸位演算法 都採用 SIFT 特徵或 SURF 特徵,本論文將減少 基於 SIFT 特徵的視覺歸位演算法所需要的中途 航點。如此所提出來的方法便可以擴充到多數基 於 SIFT 特徵的視覺歸位以算法[3][11]。為了要這 樣達到這樣的效果,我們需要一個在較高尺度差 異的影像比對方法以及設計一個不影響導航精準 度的導航演算法。
III. 導航演算法
在這一節中將會詳細敘述基於影像序列的導 航系統其中包含了:影像比對的方式,基於兩種 模式的局部視覺歸位方法,以及如何建立參考影 像 來 表 示 一 條 路 線 。 影 像 比 對 的 方 法 是 使 用 log-polar 轉換[7][8]。此比對方法在兩張影像間尺 度差距很大時仍可以找到對應的區域。當基於 SIFT 特徵的比對方法不能找到機器人當下影像 與參考影像的對應時將採用 log-polar 轉換。據 此,本論文提出的視覺歸位有兩種模式:當機器 人開始導航的時候移動速度較快但精確度較低的 模式,以及當機器人接近目標航點時移動速度較 慢但是精確度較高的模式。透過反覆的運行局部 視覺歸位的演算法以及對應某路徑的參考影像序 列,機器人便可以從任一航點導航至一不相鄰的 較遠航點。
A. 建立參考影像序列以及影像比對的方法
Log-polar 轉換是一種代表人類和靈長類動 物的在視覺皮層上視網膜的模型[7],光感受器的 密度由密度最高的眼睛中央往外以指數的方式逐 漸降低。由於 log-polar 轉換和生物結構的相似性 以及旋轉尺度不變的特性,因此其被廣泛地運用 在電腦視覺中。Zokai 和 Wolberg[8]建議利用 log-polar 轉換來進行影像拼接並且使用影像金字 塔來降低計算量。Traver 和 Bernardino[14]有針對
log-polar 轉換在機器人領域中的應用做探討。
所謂的 log-polar 轉換即是將一張影像從笛卡 爾座標系轉換至 log-polar 座標系。假設給一張參 考笛卡爾座標的灰階影像 I,在此影像上以圓心 (uc, vc)半徑 rMax畫出一個圓,圓內每一個 pixel (u, v)都利用下式轉換到 log-polar 座標系。
( 1 ) 其詳細實做細節,以及轉換時所使用的參數 皆可在[8]中找到。圖片 3 左邊的是一張 320x240 在笛 卡 爾座標 系的影像, 其中心 點 (uc, vc) 是 (160,120)而 rMax 是 80 pixels,而右邊則是他的 log-polar。其中的紅色矩行是表示在兩張影像中 相對應的位置。
圖 3 在笛卡爾座標系以及 log-polar 座標系的影像
一個 log-polar 轉換的主要優點是其對於旋轉 和縮放的不變性,把影像 I 分別沿著θ軸進行旋 轉與沿著 r 軸縮小後得到 IRot和 IScale和 I 比對後 可以得出旋轉或是縮放的量。
兩個影像使用 log-polar 轉換來對應的方式可 以參考 Algorithm 1。給定第二張影像 I2中心點 (Uc,2 , Vc,2)和半徑 rMax,接著就可利用 log-polar 轉換求出 ILp,2。Normalized Cross Correlation (NCC)是透過沿著θ軸和 r 軸平移後計算 ILp,l 和 ILp,2 中互相重疊的區域相似度的方法。由於最大 的相關值是在四維的空間中搜尋 (u, v, Δr, Δ θ),因此這樣的計算量在不利於即時的運算。
一個改近的方法是使用影像金字塔[8]。首先 將影像 I1 和 I2 進行縮放以建立影像金字塔,對 應步驟會先在影像金字塔最粗略那層中大略的搜 尋最大 NCC 的像素區域位置,接著可以在下一
層的對應位置去尋找最大的相關數,重複執行,
直到回到原圖。有了這個局部搜尋的方法,2 張 640x480 的影像對應在[8]僅需要 20 秒。
由於參照的影像和機器人的視角間的對應只 要在 3 維空間(u, v, Δθ)中尋找,2 張 320x240 的 影像對應所花費時間將減至 0.3 秒。圖 4 所舉的 例子則是採用 log-polar 轉換做影像對應的結果。
圖 5 尺度差異過大下 log-polar 比對的結果
B. 透過log-polar轉換和epipolar幾何的影像序列導航系
統
在相連航點間,我們採用了四個階段的局部 視覺歸位來導航。該機器人的運動軌跡每一階段 如圖 6 所示。在這四個階段中主要的任務是使機 器人能夠到達 waypoint 時保持與拍攝參考影像一 致的面向,使用基於 SIFT 視覺歸位 的方法配 合 log-polar 轉換得以使導航的速度加快並無損其 精準度。
圖 6 局部視覺歸位時機器人的路徑
階段 1 - 旋轉到面對航點的方向:機器人會 先概略地旋轉到下一個 waypoint 的方向。因為攝 影機的視角是有限的,因此先找尋到與參考影像 相似的影像才能找尋影像中的對應。旋轉角度可 以參考在交導路徑時的里程計。
階段 2 - log-polar 轉換時的快速動作:機器 人會以較快的動作但正確率較低的模式,來降低 機器人視野和參照影像的差值。由於兩張影像的 差距太大而無法採用 SIFT 特徵對應, epipolar 幾何是不可求的。機器人僅能使用 log-polar 轉換 所得出的影像對應結果粗略推論出移動方向來降 低機器人與航點的距離以及影像差異。
假設使用 log-polar 轉換的對應區域中心座 標為(UCur, VCur)(URef, VRef)所得出的結果,其尺度 差值以 Δs 代表。機器人導航的速度設計為正比 於 Δs,而角速度 ω 則正比於對應座標的橫向差 異。
( 2 ) 其中 Kω 代表一個控制的增益。機器人不一定會 隨著這個被計算出的角速度朝著航點的方向移 動,但[4][5]已經證明出機器人最後會在航點上收 斂,只要其移動路徑夠長。
移動速度 VLog_Polar與Δs 成正比,
( 3 )
,其中 KLogPolar代表控制增益,Θs 代表一個門檻 值,用來判斷差值是否過大, VBaseLogPolar則是表 示最小的移動速度。KLogPolar通常設成 40,Θs 設 在 1.5 和 2.5 之間,VBaseLogPolar則設為每秒 40 公 分。
當 Δs 變小時機器人會從階段 2 切換成階段 3。當機器人在導航的時候會隔一段時間就判斷一 次Δs 是否小於或等於 Θs,透過多次檢查可以避 免因為偶然的錯誤使得機器人過早從階段 2 切換 成階段 3。
階段 3 - 利用 epipolar 幾何的精確移動: 在 第三階段中為了到達航點,機器人採用精確但緩 慢地移動,此時因為機器人與航點的距離已經足 以找到機器人視野與參考圖像間的 SIFT 對應特 徵,並計算 epipolar 幾何決定導航角。
( 4 ) ( 5 )
其中 KEpipolar 與 VBaseEpiPolar 分別設為 2 公尺每秒
和 1 公分每秒。當|OFv|小到極限時,則機器人停 止移動。有關在航點上經由最小|OFv|的局部精準 化的詳細證明在先前的論文[3]。
階段 4 - 對齊特徵視覺的精準航行: 在機器 人到達航點後,所航行的方向會經由最小化|OFu| 而有所調整,兩張在相同位置與方向被照下的影 像,其特徵位置都會相同。換句話說,垂直成分 OFv與水平成分 OFu都會等於 0。在第三階段,
OFv 變為最小值後,機器人轉到最小化|OFu|於第 四階段中,旋轉的角度是根據|OFu|所決定,如果
|OFu|小於門檻值 ΘOFu則停止旋轉。
為了提升導航速度,當目標航點是中途航點 時階段 4 可以被省略不做。因為機器人目前方向 與下個航點的方向差距是很小的,機器人省略階 段 4 並直接針對下張參考圖像做視覺歸位。這將 會減少導航時間並使機器人緩慢的通過中途航點 而不會停下來調整航行頭向。
C. 建立參考影像序列的方法
經由教導機器人導航路徑的同時拍攝參考圖 像的候選,參考影像序列可以從這些候選中選取 來建立。首先在服務航點和交界航點的影像因為 其功能,因此一定要選取。
在兩兩服務航點和交界航點中,如果機器人 離目標航點太遠則會需要一個以上的中途航點。
需要的數量是依據影像對應法來決定,在此使用 log-polar 轉換的對應方法能夠達到較少的參考 影像。
圖 7 顯示從航點 A 到 B 參考圖像的建立程 序,一開始機器人沿著路線航行並擷取候選參考 圖像,虛線圖與實現圖分別代表候選參考圖像與 已選定的參考圖像,如圖 7 (a)所示,候選參考圖 像在服務航點 A 與交界航點 B 優先被選取為參考 圖像。而圖 7(b)上顯示 log-polar 轉換的對應結
果,根據機器人遠離航點 B 其圓形面積逐漸變小 的特性可以判斷出錯誤的對應,一旦出現錯誤的 對應結果將標記為叉。導致對應錯誤的候選影像 會被設定為參考影像且所對應的位置會被設定為 中途航點,圖 7(c)顯示出參考影像與航點的最後 結果。
圖 7 建立參考影像的過程
IV. 效能評估
本節將會透過提出的方法與[3]來進行一個實 驗比較,以證明將 log-polar 轉換加入後的改善效 果。在此實驗中路徑如圖 1 所示,從服務的 A 點 開始經過中途點 B 然後到達服務的 C 點,其路徑 總長度為 117.3 公尺。本實驗的控制增益以及參 數的設定如所示。
在此實驗中機器人在最後導航結果偏差小於 15 公分內走完全部路徑,共花費了 442 秒。會在 中間建立 7 個參考影像,其中只有 4 個必要的參 考影像。機器人從 A 到 B 點的移動速度記錄在圖 8 之中。當機器人到達中間點和 B 點時分別是第 118 張和第 182 張影像。從第 1 張到 73 張的影像 是機器人是在階段 2 且平均速度是 44 公分每秒時 拍攝的,然後當Δs 小於 Ɵs 且經過一段的確認時 間後,機器人切換到階段 3 並且移動速度降至 17
公分每秒。
表 1 控制增益以及參數的設定
圖 8 導航時的速度變化
採用[3](基於從 sift 特徵點計算 epipolar 幾何 的方法)拿來於相同路徑下做實驗比較。我們提 出的方法需要 11 張參考影像,而[3]需要 35 張參 考影像。提出的方法導航機器人需要 442 秒但是 [3]方法卻需要 1054 秒。實驗結果於表 2。
表 2 長路徑下的實驗以及與[3]比較
V. 結論
本論文提出基於序列的視覺導航系統。當一 個機器人使用參考影像序列進行導航時,當機器 人在往航點靠近時會先以移動速度較快但精準度 較低的模式行進,而當接近目標點的時候則採用 相反的模式行進。機器人在 11 張參考影像以及 117.3 公尺的路徑下其平均速度為 26.5 公分每 秒。與其他的方法相比,提出的方法需要的參考 影像更少而且可以有更快的導航速度。
提出的導航系統在兩方面需要改進。依賴人類來判斷從 log-polar 得出的對應結果是否為正確的並決定參考影像序 列該如何選取,因此當環境更大且更複雜的時候,參考影 像 的 建 立 變 成 一 個 相 當 耗 時 的 工 作 。 此 外 , 當 使 用 log-polar 轉換時,由於動態物體在視野中的移動而導致錯 誤的對應將會影響導航。未來工作將會從以上的兩個問題 來下手。
參考文獻
[1] J. Courbon, Y. Mezouar, and P. Martinet, “Indoor navigation of a non-holonomic mobile robot using a visual memory,” Autonomous Robots, vol. 25, no. 3, pp. 253-266, 2008.
[2] G. Lopez-Nicolas, N. R. Gans, S. Bhattacharya, C. Sagues, J. J.
Guerrero, and S. Hutchinson, “Homography-based control scheme for mobile robots with nonholonomic and field-of-view constraints," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. pp, no. 99, pp. 1-13,2009.
[3] Y. Fu, T.-R. Hsiang, and S.-L. Chung, "Robot navigation using image sequences," in Proceedings of the 6th International Conference on Ubiquitous Robots and Ambient Intelligence, Gwangju, Korea, Oct. 29-31 2009, pp. 163-167.
[4] J. Ido, Y. Shimizu, Y. Matsumoto, and T. Ogasawara, "Indoor navigation for a humanoid robot using a view sequence," The International Journal of Robotics Research, vol. 28, no. 2, pp.
315-325, 2009.
[5] Z. Chen and S. T. Birchfield, "Qualitative vision-based path following," IEEE Transactions on Robotics, vol. 25, no. 3, pp.
749-754,2009.
[6] G. Lopez-Nicolas, J. J. Guerrero, and C. Sagues, "Visual control of vehicles using two-view geometry;' Mechatronics, 2010, to be published.
[7] E. L. Schwartz, "Computational anatomy and functional architecture of the striate cortex: a spatial mapping approach to perceptual coding," Vision Research, vol. 20, pp.
645-669, 1980.
[8] S. Zokai and G. Wolberg, "Image registration using log-polar mappings for recovery of large-scale similarity and projective transformations," IEEE Transactions on Image Processing, vol. 14, no. 10, pp. 1422-1434, 2005.
[9] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, no.
2, pp. 91-110, 2004.
[10] G. L. Mariottini, G. Oriolo, and D. Prattichizzo, "Image-based visual servoing for nonholonornic mobile robots using epipolar geometry," IEEE Transactions on Robotics, vol. 23, no. 1, pp.
87-100,2007.
[11] Lopez-Nicolas, C. Sagues, I. I. Guerrero, D. Kragic, and P.
Iensfelt, "Switching visual control based on epipoles for mobile robots:' Robotics and Autonomous Systems, vol. 56, no. 7, pp. 592-603, 2008.
[12] D. Fontanelli, A. Danesi, F. A. W. Belo, P. Salaris, and A. Bicchi,
"Visual servoing in the large," The International Journal of Robotics Research, vol. 28, no. 6, pp. 802-814, 2009.
[13] H. Bay, T. Tuytelaars, and L. V. Gool, "Surf: Speeded up robust features," in Proceedings of the 9th European Conference on Computer Vision, vol. 3951, Graz, Austria, May 7-13 2006, pp.
404-417.
[14] V. I. Traver and A. Bernardino, "A review of log-polar imaging for visual perception in robotics," Robotics and Autonomous Systems, vol.
58, no. 4, pp. 378-398, 2010.
8
國科會補助專題研究計畫成果報告自評表
請就研究內容與原計畫相符程度、達成預期目標情況、研究成果之學術或應用價 值(簡要敘述成果所代表之意義、價值、影響或進一步發展之可能性) 、是否適 合在學術期刊發表或申請專利、主要發現或其他有關價值等,作一綜合評估。
1. 請就研究內容與原計畫相符程度、達成預期目標情況作一綜合評估
█
達成目標
□ 未達成目標(請說明,以 100 字為限)
□ 實驗失敗
□ 因故實驗中斷
□ 其他原因
說明:本研究計畫針對自主機器人提出一個基於影像序列的視覺導航系統。系 統先取得表示著導航路線的影像序列。當導航時,兩種導航模式根據機器人當下 影像以及影像序列中最接近參考影像之間尺度差異來做切換,以減少導航時間。
相較於其他研究以較慢(10 公分/秒)的速度下導航,本研究提出的方法可以在較 快的平均速度(26.5 公分/秒)下導航而不失其導航精準度,此外也減少了表示一 個路線所需要的參考影像張數。
2. 研究成果在學術期刊發表或申請專利等情形:
論文:□已發表 □未發表之文稿 █ 撰寫中 □無 專利:□已獲得 □申請中 □無
技轉:□已技轉 □洽談中 □無 其他: (以 100 字為限)
本成果已發表於 2010 SMC,題目為:Sequence-based Autonomous Robot Navigation by Log-Polar Image Transform and Epipolar Geometry,並且增加 內容撰寫的文稿,目前正投稿國際期刊,審核中。
附件二
9
3. 請依學術成就、技術創新、社會影響等方面,評估研究成果之學術或應用價 值(簡要敘述成果所代表之意義、價值、影響或進一步發展之可能性)(以 500 字為限)
本研究的學術成就是提出基於序列的視覺導航系統。當一個機器人使用參考 影像序列進行導航時,當機器人在往航點靠近時會先以移動速度較快但精準 度較低的模式行進,而當接近目標點的時候則採用相反的模式行進。技術實 現的例證上,採用台科大連接大樓為例,機器人在 11 張參考影像以及 117.3 公尺的路徑下其平均速度為 26.5 公分每秒。與其他的方法相比,提出的方 法需要的參考影像更少而且可以有更快的導航速度。
在一般室內的機器人導航上,本研究所提出的方法,不需要完整的環境
塑模。對於用來當作搬移以傳送物品、停駐點為固定的機器人應用,本計劃
所提出的方法相對傳統,需要先建立地圖,再比對機器人定位,以行走下一
步的方法,有塑模較簡單以及導航較短的優勢。
10
國科會補助計畫衍生研發成果推廣資料表
日期:100 年 09 月 15 日
國科會補助計畫
計畫名稱:以有限張數的場景影像為依據作為覆蓋建築 物內全域動線的視覺導航系統
計畫主持人:鍾聖倫
計畫編號:NSC 99-2221-E-011-134 領域:機器人
研發成果名稱
(中文)以有限張數的場景影像為依據之視覺導航系統
(英文)Sequence-based Autonomous Robot
Navigation by Log-Polar Image Transform and Epipolar Geometry
成果歸屬機構 國立台灣科技大 學
發明人 (創作人)
鍾聖倫、項天瑞、傅 宇
技術說明
(中文)
本研究計畫針對自主機器人提出一個基於影像序列的視 覺導航系統。系統先取得表示著導航路線的影像序列。
當導航時,兩種導航模式根據機器人當下影像以及影像 序列中最接近參考影像之間尺度差異來做切換,以減少 導航時間。當尺度差異很大時,log-polar 轉換可以用來 找到兩張影像之間的對應,此時機器人以一種較快但較 不精準的導航模式來移動直到接近目標航點。當機器人 由於往航點移動而漸漸將當下影像和參考影像的尺度差 異減少後,一種較精確但慢速的導航模式將啟用。相較 於其他研究以較慢(10 公分/秒)的速度下導航,本論文提 出的方法可以在較快的平均速度(26.5 公分/秒)下導航而 不失其導航精準度,此外也減少了表示一個路線所需要 的參考影像張數。
附件三
11
(英文)
A sequence-based visual navigation system for autonomous robots is proposed in this paper. A sequence of reference images is taken to represent a transportation route. When a robot navigates in a route, two navigation modes, with better reduction in total transportation time required, are
switched according to the difference of scales between the robot’s current view and the targeted reference image. While other sequence based
navigation methods moves the robot at a maximal velocity of 10 cm/sec, the proposed method moves the robot in an average velocity of 26.5 cm/sec during a 117.3 m long route with 11 reference images.
Comparing to similar sequence-based navigation research, the proposed method doe not only reduce transportation time by at least half, but also require less number of reference images needed to guide the autonomous robots in route transportation applications.
產業別
機械、電機
技術/產品應用範圍
室內定點路線搬移用機器人
技術移轉可行性及預期 效益
在一般室內的機器人導航上,本研究所提出的方法,不 需要完整的環境塑模。對於用來當作搬移以傳送物品、
停駐點為固定的機器人應用,本計劃所提出的方法相對 傳統,需要先建立地圖,再比對機器人定位,以行走下 一步的方法,有塑模較簡單以及導航較短的優勢。
註:本項研發成果若尚未申請專利,請勿揭露可申請專利之主要內容。
12
國科會補助專題研究計畫項下出席國際學術會議心得報告
日期:100 年 09 月 15 日
中國控制與決策會議(CCDC)概況
中國控制與決策會議(CCDC),是由《控制與決策》編輯委員會聯合中國自動化學會應用專業委員會、
中國航空學會自動控制專業委員會等學術組織,仿美國 CDC 會議,於 1989 年創辦的大型年度學術會 議,至今已經連續舉辦了 22 屆。最近幾年,每年參會人數,包括中國與國外控制專業學者,都在 600 人左右。從 1991 年起,會議論文集由《控制與決策》編輯部編輯,東北大學出版社出版。最近幾年,
每年論文集收入的文章都在 1000 篇以上。2011 中國控制與決策會議將更加國際化。除原有中國國內 的學術組織外,IEEE 控制系統協會與 IEEE 工業電子協會均將對本次會議提供技術支援。2011CCDC 會議論文集英文論文由 ISTP 收錄以及進人 IEEE Xplore Data Base,並將被 EI 檢索。
參加會議經過
CCDC 是大陸所舉辦,類似美國的 CDC,是大陸同領域最具規模的研討會議。本人於 100 年 5/23 飛上 海,再轉飛到四川,綿陽機場。開會地點在位處綿陽市的西南科技大學。在三天的會議中,共有五個 keynotes,以及 11 個 semi-plenary speech,主講者分別來自美國,澳大利亞,利大利,以及新加波與中 國等,主要內容是控制的新理論、新視野、以及控制應用。在五個 keynote speech 裏,我覺得 Beijing University of Posts and Telecommunications 的 Deyi Li 講的 Fusion of Decision and Control: Scientific Research and Experiments in Intelligent Drive,總結控制系統的過去與未來,最有資訊性。另,Tsinghua
計畫編號 NSC 99-2221-E-011-134
計畫名稱 以有限張數的場景影像為依據覆蓋建築物內全域動線的視覺導航系統 出國人員
姓名 鍾聖倫 服務機構
及職稱 台灣科技大學電機系教授 會議時間 2011 年 5 月 23 日
至同年 5 月 25 日 會議地點 中國、四川、綿陽 會議名稱
中文: 2011 中國控制與決策會議(2011 CCDC)
英文: 2011 Chinese Control and Decision Conference (2011 CCDC)
發表論文 題目
(中文)在制冷模式下汽車空調的溫度與濕度控制
(英文) Temperature and humidity control for an automobile during heating period
附件四
13
University 的 Cheng Wu 所講的 The Status and Prospect of Information Technology in Chinese
industrialization 則是介紹控制在 IT 產業中所扮演的角色。大會另外安排的討論會 (semi-plenary)裏,
National University of Singapore 的 Ben M. Chen 講 Modeling and forecasting of stock markets under a system adaptation framework,從控制觀點面對股票市場不確定性,而要達到最大效益的目標進行演講,
非常精彩。我的報告是在 WeA04 的 Control Applications。報告順利,並與聽眾有多問答。
與會心得
本次 CCDC 大會的投稿有 1280 篇,分別來自 28 個國家與地區。大部份的時間,議程是以七組同時進 行。所有的 session 幾乎都是用英文報告,而且報告人的英文水準比我想像中的好。在我 control application 的 session 中,有聽到與國防、產業相關的控制系統報告。大陸控制學者之多以及研究題目 的廣度讓我印象深刻。另外,我比較吃驚的是,有若干個 keynote 以及 semi-plenary 實際來報告的不是 大會編列的大牌,而是由博士後或是研究團隊來報告,這在其他國家是不太可能發生的。大會每天晚 上都有晚宴,國內外的與會學者,齊聚一堂,交流學術思想,討論學術問題。此次參加 CCDC 過程中,
個人覺得最大的收獲是與國內外學者進行研究想法上的交流。報告現場聽眾的提問,可以讓我知道研 究同樣學者關心的技術瓶頸,而他們的建議也擴展我後續的研究目標。透過與會人士報告自己最新的 研究成果,並積極參與、交換研究心得,是我覺得最大的收獲。
Sequence-based Autonomous Robot Navigation by Log-Polar Image Transform and Epipolar Geometry
Yu Fu∗, Tien-Ruey Hsiang†, and Sheng-Luen Chung‡
∗Dept. of Electrical Engineering
National Taiwan University of Science and Technology, Taipei, Taiwan Email: [email protected]
†Dept. of Computer Science and Information Engineering National Taiwan University of Science and Technology, Taipei, Taiwan
Email: [email protected]
‡Dept. of Electrical Engineering
National Taiwan University of Science and Technology, Taipei, Taiwan Email: [email protected]
Abstract—A sequence-based visual navigation system for au- tonomous robots is proposed in this paper. A sequence of reference images is taken to represent a transportation route.
When a robot navigates in a route, two navigation modes, with better reduction in total transportation time required, are switched according to the difference of scales between the robot’s current view and the targeted reference image. In the first mode of Region-based matching method, log-polar transform is applied to find a less accurate moving direction when the difference of scales is large. When the difference of scales is close, the second mode of epipolar geometry with SIFT features is employed to direct the robot for a more accurate motion when closing into the waypoint where the targeted reference image is taken. The robot’s moving velocity is adjusted accordingly, thus reducing the overall transportation time. While other sequence- based navigation methods moves the robot at a maximal velocity of 10 cm/sec, the proposed method moves the robot in an average velocity of 26.5 cm/sec during a 117.3 m long route with 11 reference images. Comparing to similar sequence-based navigation research, the proposed method doe not only reduce transportation time by at least half, but also require less number of reference images needed to guide the autonomous robots in route transportation applications.
Index Terms—Visual Navigation, SIFT, and Log-Polar Trans- form
I. INTRODUCTION
A robot is capable of moving from one position to another assist human being. In order to make the robot move au- tonomously in a working environment, an representative model for the working environment and a navigation technique are required. Geometric map and topological map are two common models to represent the environments. Different navigation techniques are developed according to the provided map.
A navigation technique based on image sequence [1] [2]
[3] [4] with the topological map is studied in this paper. The navigation technique is inspired by human being and some animals, which store observed image sequences as cues for navigation without knowing the exact metric in the environ- ment. A robot is guided in a route to construct the image sequence for the working environment in the beginning, then
autonomously navigates with the built visual memory in the same route. In this paper, each image in the image sequence is named ”reference image,” and the position where the reference image is taken is called ”waypoint.”
A. Related Work
A lot of sequence-based navigation techniques have been published. Chen and Birchfield [1] track the corners from the previous reference image and take a new reference image when the number of tracked corners is reduced to 50% during the guidance. The robot navigates with a heading direction which is decided from the relative odometry and a difference between features and their correspondences. Lopez-Nicolas et al. [5]
design an epipolar geometry-based and homography-based navigation method which leads the robot from one waypoint to its neighbor waypoint in most indoor environments. Fu et al. [2] develop a multi-waypoint navigation technique based on epipolar geometry. Ido et al. [3] modify the sequence-based navigation method when a humanoid robot is used, especially for the blurred images due to the swing motion. Courbon et al. [4] and Lopez-Nicolas et al. [6] design the navigation method under the a limited Field Of View(FOV) and non- holonmoic constraint.
Few of the above researches are able to navigate the robot in a high velocity in reality. Chen and Birchfield [1] and Ido et al. [3] navigate the robot when a maximal velocity is 10 cm/sec. Lopez-Nicolas et al. [5] provide a successful navigation while 4 cm/sec is the maximal velocity .
B. Summary of the Proposed Navigation System
In this work, we start from our previous work [2] with the purpose to reduce the number of the reference images and the runtime of navigation. An image matching method based on log-polar transform [7] [8] is applied to construct the reference image under large difference of scale. While the robot approaches a targeted waypoint, the behaviors of navigation are divided into two modes: one moves the robot faster but less accurately with the image matching result by
using log-polar transform while the robot starts navigation, while the other moves the robot more accurately but slower with the SIFT matching result while the robot is close to the waypoint. The navigation system is tested on a two-wheeled robot with a conventional camera installed on the top of the robot to look forward. The experimental results show the robot spends half of runtime and moves in a higher velocity with the proposed navigation system within a same accuracy as compared with the previous work [2].
C. Contribution
The contributions of this paper are listed as follows:
1) The robot with the proposed navigation system is able to move in a velocity of 26.5 cm/sec, which is faster than other researches in indoor environments. Thus, the robot takes less time to reach a same destination with the proposed method.
2) The number of the required reference images is reduced to 32% with log-polar transform-based matching method as compared the one with SIFT matching method.
D. Paper Organization
The rest of this paper is organized as follows: Section II defines the problem; Section III explains the image matching method with log-polar transform and details the navigation system and the procedure to construct the reference images;
Section IV provides an experiment to evaluate the performance of the proposed navigation system; Section V concludes this paper.
II. PROBLEMDEFINITION
The major purpose of the proposed sequence-based navi- gation system is to reduce the number of reference images and the run time for navigation. Most visual homing methods design a slow moving velocity when the robot is close to a targeted position. Mariottini et al. [9] multiplies the squared norms of the difference between the position of features in the current view and their correspondences in the reference image by a control gain to decide the moving velocity. Lopez- Nicolas et al. [10] design a multi-phase visual homing method which first navigates the robot to the back of the waypoint with a faster and variable moving velocity, then moves the robot forward to the waypoint with a slower and constant moving velocity. Fu et al. [2] adopt a proportional controller to decide the moving velocity according to the average magnitude of vertical component of optical flows for matched features.
When these above visual homing methods are applied in a mutli-waypoint path, the total run time becomes longer while a path is represented with more waypoints.
A multi-waypoint path is composed of three kinds of waypoints - service waypoints, junction waypoints, and inter- mediate waypoints. The service waypoints stand for important positions where the robot provides service, such as a delivery point for a transportation robot or an information desk for a guide robot in a museum. The junction waypoints, like a crossroad or t-junction, indicate the junctions between two
Fig. 1. Waypoints in a building complex
paths in an environment. The intermediate waypoint is used between two waypoints when the distance between the two waypoints is too long for the robot to departure from one waypoint to the other directly with a visual navigation method.
The number and the position of waypoints are different by using different visual navigation method. Fig. 1 shows an example of the waypoints in a building complex.
The number of the intermediate waypoints is reduced in order to speed up the entire navigation procedure by designing a local visual homing method which is capable of navigating the robot from a farther starting position. A minimal number of matched features confines the set of robot’s starting position to a local region around the waypoint when applying the visual homing method, because a correct epipolar geometry or homography matrix is obtained to infer the moving direction and velocity while sufficient number of matched features are found in most feature-based local visual homing methods [2]
[11]. Fig. 2 shows an example when the difference of scale between two images is too large to calculate the correct epipolar geometry due to an insufficient number of matched features with Scale-Invariant Feature Transform (SIFT) [12].
The left and right images in Fig. 2 represent the robot’s initial view and the reference image. In this example, an intermediate reference image is required between the starting position and the targeted waypoint.
The proposed visual homing method intends to extend the applicable local region of the SIFT visual homing methods [10]
[2] in order to reduce the number of intermediate waypoints, since scale-invariant features, such as SIFT [12] or Speed Up Robust Feature (SURF) [13], are adopted in most feature-based visual homing method. In doing so, a matching method is required when the number of correctly matched SIFT features is insufficient because a large difference of scale exists between the robot’s view and the reference image. Besides that, a two-
Fig. 2. Matched SIFT features when the difference of scale between two images is too large
stage visual homing method is developed to move the robot to the applicable region of the SIFT visual methods, then to the targeted waypoint.
III. APPROACH
The proposed sequence-based navigation system is detailed in this section, including the image matching method of two images, the proposed local visual homing method, and the con- struction method of reference images for a route. The matching method which is based on log-polar transform [7] [8] is capable of finding corresponding regions in two images when the relative distance from the robot to the targeted waypoint is large. The matching method based on log-polar transform is adopted when a relative distance between the robot’s position and the waypoint is large, while SIFT matching method is used when the relative distance is moderate or small. According to these two matching methods under difference scale of difference, the proposed local visual homing method navigates the robot in two modes: one moves the robot faster with less accuracy when the robot starts navigation, and the other moves the robot more accurately but slower when the robot is close to the targeted waypoint. Given several reference images, the robot moves from one waypoint to a nonadjacent waypoint with the proposed visual homing method. The construction method of the reference images is explained at the end of this section.
A. Construction of Log-Polar Image and Matching Method between two Log-Polar Images
The log-polar transform is a representative model of the retina in the primary visual cortex for human and primates [7].
The density of photoreceptors around the fovea in eyes is high and becomes low from the fovea exponentially. The log- polar transform is widely utilized in computer vision since its biological plausibility and invariance of rotation and scale.
Zokai and Wolberg [8] propose an image stitching method with log-polar transform and reduce the computation with the image pyramid. Traver and Bernardino [14] survey the applications of log-polar transform in robotics.
The log-polar transform converts an image in Cartesian co- ordinate to an image in log-polar coordinate. Given a grayscale image I in Cartesian coordinate, each pixel (u, v) inside a circle with radius rM ax and center (uc, vc) is transformed to log-polar coordinate in Eq. (1). The detail implementation and
Fig. 3. Images in Cartesian coordinate and in log-polar coordinate
the used parameters, such as rM ax and the resolution of log- polar image can be found in [8]. Fig. 3 shows a 320×240 image in Cartesian coordinate and its log-polar image while the center (uc, vc) is (160, 120) and the radius rM ax is 80 pixels. The red rectangle is denoted as a corresponding pixel in both images.
r = logbasep
(u − uc)2+ (v − vc)2 θ = arctanv − vc
u − uc
(1) A major benefit of the log-polar transform is the invariance of rotation and scale. Given IRot and IScale as the rotated and shrunk images of the image I, the log-polar transform of IRotand IScaleare found respectively by shifting the log-polar image of I along the θ-axis circularly and along the r-axis, respectively.
The matching method between two images by using log- polar transform is shown in Algorithm 1. Given the center (uc,2, vc,2) in the second image I2and the radius rM ax, ILP,2
is first computed by log-polar transform. On the other hand, the log-polar image ILP,1 is calculated for each pixel in the first image I1. By shifting ILP,1 along the θ-axis and the r- axis, Normalized Cross Correlation (NCC) is calculated as a measure of similarity of the overlapped region between ILP,1
and ILP,2. Because a maximum of correlation value is searched in a four-dimensional space - (u, v, ∆r, ∆θ), this matching procedure takes too much time in real-time applications.
A improved matching method which reduces the processing time is developed with image pyramid [8]. The images I1
and I2 are downsampled to construct the image pyramid. The matching process starts from the coarsest level of the image pyramid to globally find a largest correlation value. In the next level, the largest correlation value is locally searched around the pixel where the largest correlation value is found in the previous level. This procedure repeats until the largest correlation value is found in the finest level. With this local searching strategy, the total amount of processing time for matching two 640×480 images is reduced to 20 seconds in [8].
The largest correlation value in a three-dimensional space - (u, v, ∆θ) is searched in the proposed navigation system in
Algorithm 1 Matching method between two images with log- polar transform
Require: grayscale images I1 and I2, the center (uc,2, vc,2) of log-polar transform in I2
1: ILP,2= LogP olarT ransf orm(I2, (uc,2, vc,2))
2: for all pixels (u, v) in I1 do
3: ILP,1= LogP olarT ransf orm(I1, (u, v))
4: for all shifts ∆r along r-axis and ∆θ along θ-axis do
5: ncc = N CC(ILP,1, ILP,2, ∆r, ∆θ)
6: if largest ncc then
7: Dif f erence of rotation = ∆θ
8: Dif f erence of scale = base∆r
9: Best matched center in I1= (u, v)
10: end if
11: end for
12: end for
Fig. 4. Image matching with log-polar transform
order to reduce the processing time. Since the robot moves on the ground with a fixed camera, no difference of rotation exists between the robot’s view and the reference image.
This simplified matching procedure takes 0.3 milliseconds to match images therefore serves the purpose for real-time robot navigation. An example of matching result is shown in Fig.
4. The same image pair as Fig. 2 is used to demonstrate the matching ability of SIFT matching method and log-polar transform.
B. Sequence-based Navigation System by LogPolar and Epipo- lar Geometry
Given a series of reference images which are captured on the waypoints in a route, the proposed navigation system moves the robot to pass each waypoint and to reach a desig- nated service waypoint finally. A local visual homing method is developed to navigate the robot from one waypoint to the next waypoint.
The proposed local visual homing method is divided into four phases: rotation to the direction of the next waypoint, fast motion with log-polar transform, accurate motion with epipolar geometry, and heading correction with optical flows for matched features. The robot’s moving trajectory for each phase is shown in Fig. 5.
1) Phase 1: rotation to the direction of the next waypoint:
The robot rotates to a direction which roughly points to the next waypoint in the first phase. This rotation is necessary after the robot reaches the junction waypoints, because the used
Fig. 5. Robot’s moving trajectory with the proposed local visual homing method
forward-looking camera has limited field of view and cannot face to a direction to the next waypoint. The rotation angle is recorded in advanced. The number of reference images on a waypoint depends on the number of linked edges on each waypoint. The relative angle between two edges is recorded in advanced. Therefore, these relative angles can be used as a direction to the next waypoint when the robot is on the junction waypoint.
2) Phase 2: fast motion with log-polar transform: The robot moves fast but less accurately in the second phase in order to reduce the difference of scale between the robot’s current view and the reference image in the second phase.
Because the number of matched SIFT may not enough to calculate the epipolar geometry when the difference of scale between two images is too large, the robot adopts a simple navigation strategy with the log-polar transform to infer a moving direction to reduce the distance between the robot and the waypoint.
Given a image matching result with log-polar transform between the robot’s current view and the reference image as (uCur, vCur) ↔ (uRef, vRef) and the difference of scale as
∆s, the angular velocity ω can be determined in a propor- tional controller with the horizontal component of translation difference, as shown in Eq. (2). Kω stands for a control gain.
The robot may not move in the direction which points to the waypoint directly with this calculated angular velocity, however, Chen [1] and Ido [3] have proved that the robot ultimately converges to the waypoint with this calculated angular velocity when the length of the route is long enough.
ω = −Kω× (uCur− uRef) (2) The moving velocity VLogP olar is decided in another pro- portional controller with ∆s, as shown in Eq. (3). KLogP olar, Θs, and BaseVLogP olar stand for a control gain, a threshold which is used to judge if the difference of scale is too large, and a minimal moving velocity. KLogP olar, Θs, and VBaseLogP olar
are set as 40, a number between 1.5 and 2.5, and 40 cm/sec, respectively.
VLogP olar = KLogP olar× (∆s − Θs) + VBaseLogP olar (3) While ∆s becomes moderate or small, the phase of the robot changes from phase 2 to phase 3 such that the robot
moves more accurately. The robot is aware of a moderate or small ∆s when ∆s ≤ Θs keeps continuously for several times during the navigation. With multiple checks on the ∆s, a early transition from phase 2 to phase 3 caused by occasional mismatch of log-polar images is preventable.
3) Phase 3: accurate motion with epipolar geometry: The robot moves accurately but slow in phase 3 in order to reach the waypoint. The robot moves along a steering angle which points directly to the waypoint. Since the distance between the robot and the waypoint is close enough to match SIFT features between the robot’s view and the reference image, the epipolar geometry between the two images is computed to decide the steering angle. The moving velocity is decided with the optical flows for matched SIFT which is defined in Eq. (4). The moving velocity is calculated in Eq. (5).
KEpipolarand VBaseEpipolar are set as 2 cm/sec and 1 cm/sec, respectively. The robot stops when the |OFv| becomes a small local minimum. Meanwhile, the robot reaches the waypoint The detailed description on phase 3 can be found in our previous work [2].
OFi= (OFu,i, OFv,i) = (uCuri − uRefi , vCuri − viRef), if (uCuri , vCuri ) ↔ (uRefi , viRef), i = 0, 1, 2, · · · (4)
VEpipolar= KEpipolar× |OFv| + VBaseEpipolar (5) 4) Phase 4: heading correction with optical flows for matched features: The heading direction is corrected by mini- mizing |OFu| after the robot reaches the waypoint. When two images are captured on a same position in a same heading direction, the feature and its correspondence locate at the exact same coordinate. In other word, the vertical component OFv
and the horizontal component OFu of the optical flow are both equal to zero. After OFv becomes a minimum in phase 3, the robot rotates to minimize |OFu| in phase 4. The rotating direction is determined according to the sign of OFu. If |OFu| is smaller than a threshold ΘOF u, the rotation stops. The robot is now on the waypoint with a same heading direction where the reference image was taken.
Phase 4 can be skipped when the waypoint is an interme- diate waypoint in order to speed up the navigation. Because the difference between the direction to the next waypoint and the robot’s current direction is small, the robot skips phase 2 and begins the local visual homing with the next reference image. This skip reduces the runtime of navigation and makes the robot pass each intermediate waypoint slowly instead of stopping.
C. Construction of Reference images
Given the candidates of the reference images that are captured during guidance, the candidates that are taken on the service waypoints and junction waypoints are first selected as the reference images. The remained candidates are matched to the certain reference images to generate the rest of the
TABLE I
THE LIST OF THE CONTROL GAINS AND PARAMETERS IN THE PROPOSED NAVIGATION SYSTEM
Parameters Value Parameters Value
uc,2 160(pixel) vc,2 120(pixel)
rM ax 80(pixel)
Kω 0.2(deg/sec) KLogP olar 4(cm/sec)
VBaseLogP olar 40(cm/sec) Θs 2
KEpiolar 2(cm/sec) VBaseEpipolar 1(cm/sec)
ΘOF u 5(pixel) ΘOF v 4(pixel)
reference images and to decide the location of the intermediate waypoints.
An intermediate waypoint is decided when the distance from the robot to the targeted waypoint is too large to match the current view to the reference image successfully. The distance depends on the adopted matching method. The matching result of scale-invariant features, such as SIFT [12] and SURF [13], is correct under a moderate distance. The matching result of SIFT fails while the distance goes large, as shown in Fig. 2, while the matching result of log-polar transform still succeeds, as shown in Fig. 4. With the matching method of log-polar transform, the number of required reference images is less than the one constructed by the SIFT matching method.
Fig. 6 shows the procedure to construct a sequence of reference images in a route from wayoint A to waypoint B. In the beginning, the robot is guided through the route and takes the candidates of the reference images. The images marked with dotted lines and with solid lines are the candidates and reference images, respectively. Two candidates on the service waypoint A and junction waypoint B are first selected as the reference images in the beginning, as shown in Fig. 6(a).
Fig. 6(b) shows the matching result with log-polar transform between each candidate and the certain reference image on waypoint B. The corresponding circle becomes smaller as the relative distance between the robot and waypoint B becomes larger until a failed matching result is found, which is marked as cross. The candidate which leads the failed matching result is set as a reference image and the position where the candidate is taken becomes an intermediate waypoint. Fig. 6(c) shown the final result of the reference images and the waypoints.
IV. PERFORMANCEEVALUATION
An experiment is conducted with the proposed method and the method in [2] in order to show the improvement by introducing the log-polar transform. The route starts from the service waypont A, passes through junction waypoint B, and reaches service waypoint C in the experiment which is shown in Fig. 1. The total length of the route is 117.3 m. The control gains and parameters used in the proposed navigation system are shown in Table I.
The robot spends 442 seconds completing the route while the navigation error is bounded in 15 cm. Seven reference images on the intermediate waypoints are built while four nec- essary reference images are captured on the service waypoints