A Specialized Depth-Vision AI Model for Rehabilitation Medicine

Before we begin, let's talk about why we are exploring this topic.


The underlying need is quite clear. Gait analysis, one of LongGood's (龍骨王) core products, uses depth vision to analyze and evaluate a patient's (user's) continuous dynamic movement. By aggregating information across the limbs, and following the principles of rehabilitation medicine, it can surface information the naked eye cannot perceive in real time, and it can be linked with additional patient physiological data for further analysis and prediction.

When defining suitable patients (users), the current use scenarios for sub-healthy and post-acute patients are already fairly mature. But when we tried to extend applicability further toward acute-stage patients, we ran into difficulties.

The existing training tool HappyGoGo can be used with acute-stage stroke patients, provided that by the fifth day after the stroke they can sit up in a wheelchair and follow a standardized (SOP) training program. Gait analysis, however, focuses on dynamic movement. To pull the spectrum of applicable patients forward to the acute stage, the first problem to solve is assistive devices of varying degrees of support: crutches and walkers, as well as larger dynamic suspension systems and exoskeleton robots, are all existing modes of assistance, and all of them interfere with the judgments of depth vision.

Because existing depth vision is trained for the detection of general, everyday human activity, it has never been specifically trained on medical scenarios so that the machine can both understand them and see them precisely. For this reason, we need to retrain the depth-vision model.
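As a rough illustration of what such retraining could look like, here is a minimal sketch that fine-tunes a COCO-pretrained keypoint model from torchvision on clinic footage. This is an assumption about the pipeline, not our actual method; in particular, `clinic_loader` is a hypothetical DataLoader over clinic recordings annotated with COCO-style keypoints.

```python
# Minimal fine-tuning sketch (assumes PyTorch/torchvision; `clinic_loader`
# is a hypothetical DataLoader yielding (images, targets) where each target
# has "boxes", "labels", and "keypoints" tensors).
import torch
from torchvision.models.detection import keypointrcnn_resnet50_fpn

# Start from COCO-pretrained weights so the model already understands
# generic human poses, then adapt it to scenes with walkers, crutches,
# suspension rigs, and exoskeletons.
model = keypointrcnn_resnet50_fpn(weights="DEFAULT")
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device).train()

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

for epoch in range(10):
    for images, targets in clinic_loader:
        images = [img.to(device) for img in images]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
        loss_dict = model(images, targets)  # dict of detection/keypoint losses
        loss = sum(loss_dict.values())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```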

We hope that in the near future, regardless of which assistive devices or training equipment a patient's intervention requires, depth vision will be able to accurately track the changes in coordination across each limb throughout the dynamic training process, and thus evaluate the patient's current ability and degree of progress more precisely.
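To make "changes in coordination" concrete: one simple derived measure is a joint-angle time series computed from the detected keypoints. A minimal sketch, assuming the pose model already returns per-frame hip/knee/ankle (x, y) coordinates; the `frames` structure is illustrative.

```python
# Sketch: derive a knee-angle time series from per-frame keypoints.
import numpy as np

def joint_angle(a: np.ndarray, b: np.ndarray, c: np.ndarray) -> float:
    """Angle at vertex b (in degrees) formed by points a-b-c."""
    v1, v2 = a - b, c - b
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# One knee angle per frame; a flattening of this curve across a gait
# cycle is the kind of change a clinician would want surfaced.
knee_angles = [
    joint_angle(f["hip"], f["knee"], f["ankle"])
    for f in frames  # `frames` is a hypothetical list of keypoint dicts
]
```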

Finally, some practical takeaways. Several feasible ways to build a vision-training environment are described below, along with the advantages and disadvantages of each.

1. Windows with RTX 3080:

Part of me is still quite a tech geek, and I am particular about the hardware used for computation. A laptop with an RTX 3080 and its very high compute performance has long been a dream machine. Being a Windows system, it comes with both native advantages and limitations.

For example, the development and configuration environment is more intuitive and familiar, and since the user base is still large, there is plenty of shared experience and plenty of answers online for the difficulties encountered while training models.

A single GPU leaves little room for expansion, but it is well suited to verifying and trialing the initial results of a training module.
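Before committing to longer runs on such a machine, it is worth a quick sanity check that the framework actually sees the GPU. A minimal check, assuming PyTorch:

```python
# Confirm that the RTX 3080 is visible to PyTorch on Windows.
import torch

print(torch.cuda.is_available())          # True once the CUDA driver is set up
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. "NVIDIA GeForce RTX 3080 ..."
    print(torch.cuda.device_count())      # a single-GPU laptop reports 1
```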

2. Ubuntu on Nvidia Jetson AGX Xavier:

This is one of Nvidia's higher-end consumer-grade compute devices. Compared with its better-known little brother, the Jetson Nano, the AGX series is a high-end edge-computing platform. Having tried it, Ubuntu and the Jetson series still come with a fair amount of fiddly issues: package version problems, plus some fundamental image-processing quirks, such as Jetson shipping a special build of OpenCV, presumably for hardware acceleration. On the road to training an AI model of our own, this path involves the most miscellaneous work to deal with.

That said, we do already have a parallel-computing cluster of Jetson AGX Xavier units XD
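The "special OpenCV build" issue above is easy to verify directly: OpenCV can report how it was compiled, so you can confirm whether the installed build actually has CUDA enabled before blaming your own code. A minimal check:

```python
# Check which OpenCV build is installed and whether it was compiled with
# CUDA support (a stock pip wheel usually is not; Jetson's build may be).
import cv2

print(cv2.__version__)
cuda_lines = [ln for ln in cv2.getBuildInformation().splitlines() if "CUDA" in ln]
print("\n".join(cuda_lines))  # look for a line like "NVIDIA CUDA: YES"
```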

3.Google Colab:

This is currently the best option. When you genuinely need to scale up, the highest tier offers 500 compute units for only 49 US dollars. Environment setup is also fairly simple, and multiple powerful GPUs can work on your computation together. With this working environment, regularly adding newly certified rehabilitation hardware so the system can recognize it is no longer a pipe dream. Given what it can accomplish, and the trial-and-error cost it saves through sheer convenience, the price is a real bargain.
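Before spending paid compute units on a long run, a quick check of which accelerator the Colab runtime was assigned can save surprises. A minimal sketch:

```python
# Inside a Colab notebook: confirm the assigned accelerator before a long run.
import subprocess
import torch

print(subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True).stdout)
print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else "CPU runtime")
```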

The entire system is also quite intuitive and easy to use.
