Want a tiny voice assistant that follows your lead, runs on your own hardware, and won't accidentally order a dozen peanuts because it misheard you? A DIY AI assistant with Raspberry Pi is surprisingly achievable, fun, and flexible. You'll wire up a wake word, speech recognition (ASR = automatic speech recognition), a brain for natural language (rules or an LLM), and text-to-speech (TTS). Add a few scripts, one or two services, and some careful audio tweaks, and you have a pocketable smart speaker that obeys your rules.
Let's take you from zero to talking to your Pi without the usual hair-pulling. We'll cover parts, setup, code, comparisons, gotchas... the whole burrito. 🌯
What Makes a Good DIY AI Assistant with Raspberry Pi ✅
- Private by default - keep audio on the device where possible. You decide what leaves the device.
- Modular - swap components like Lego: wake word engine, ASR, LLM, TTS.
- Affordable - mostly open source, commodity mics, speakers, and a Pi.
- Hackable - want home automation, dashboards, routines, custom skills? Easy.
- Reliable - service-managed, boots and starts listening without anyone touching it.
- Fun - you'll learn a lot about audio, processes, and event-driven design.
Tiny tip: If you're using a Raspberry Pi 5 and plan to run heavier local models, a clip-on cooler helps under sustained load. (When in doubt, pick the Active Cooler designed for the Pi 5.) [1]
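If you want to check whether heat is actually the bottleneck before buying anything, a small helper like the one below can help. It is a minimal sketch that assumes the kernel exposes the SoC temperature in millidegrees Celsius at `/sys/class/thermal/thermal_zone0/temp` (the usual location on Raspberry Pi OS). The function names and the 80 °C warning threshold are illustrative assumptions, not official limits.

```python
from pathlib import Path

# Assumed sysfs location of the SoC temperature, in millidegrees Celsius.
THERMAL_PATH = "/sys/class/thermal/thermal_zone0/temp"


def parse_millidegrees(raw: str) -> float:
    """Convert a sysfs reading such as '48312\\n' to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_cpu_temp_c(path: str = THERMAL_PATH) -> float:
    """Read the current SoC temperature in degrees Celsius."""
    return parse_millidegrees(Path(path).read_text())


def is_running_hot(temp_c: float, warn_at_c: float = 80.0) -> bool:
    """Return True once the reading reaches the (illustrative) threshold."""
    return temp_c >= warn_at_c
```

Calling `read_cpu_temp_c()` during a long transcription and logging the result is usually enough to tell whether a cooler would make a difference.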
Parts & Tools You'll Need 🧰
- Raspberry Pi: a Pi 4 or Pi 5 is recommended for headroom.
- microSD card: 32 GB+ recommended.
- USB microphone: a simple USB conference mic works well.
- Speaker: a USB or 3.5 mm speaker, or an I2S amp HAT.
- Network: Ethernet or Wi-Fi.
- Optional niceties: case, active cooler for Pi 5, push button for push-to-talk, LED ring. [1]
OS Setup & Baseline
- Flash Raspberry Pi OS with Raspberry Pi Imager. It's the straightforward way to get a bootable microSD with the presets you want. [1]
- Boot, connect to the network, then update packages:

    sudo apt update && sudo apt upgrade -y

- Audio basics: On Raspberry Pi OS you can configure outputs, levels, and devices through the desktop UI or `raspi-config`. USB and HDMI audio are supported across models; Bluetooth output is available on models with Bluetooth. [1]
- Verify devices:

    arecord -l
    aplay -l

Then test capture and playback. If levels seem weird, check mixers and defaults before blaming the mic.
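If you would rather pick the microphone from a script than read the listing by eye, the sketch below parses the text that `arecord -l` (or `aplay -l`) prints. It assumes the usual ALSA line format, `card N: Id [Name], device M: ...`; the function names and the `plughw:N,M` identifier it builds are illustrative choices.

```python
import re
import subprocess

# Matches lines such as:
#   card 1: Device [USB Audio Device], device 0: USB Audio [USB Audio]
CARD_LINE = re.compile(r"^card (\d+): \S+ \[(.*?)\], device (\d+):", re.MULTILINE)


def parse_alsa_devices(listing: str):
    """Return (card, device, name, alsa_id) tuples from `arecord -l` / `aplay -l` text."""
    found = []
    for card, name, device in CARD_LINE.findall(listing):
        found.append((int(card), int(device), name, f"plughw:{card},{device}"))
    return found


def list_capture_devices():
    """Run `arecord -l` and parse its output (needs the ALSA utilities installed)."""
    out = subprocess.run(["arecord", "-l"], capture_output=True, text=True, check=True)
    return parse_alsa_devices(out.stdout)
```

With an identifier in hand, a three-second test could be `arecord -D plughw:1,0 -f S16_LE -r 16000 -c 1 -d 3 test.wav` followed by `aplay test.wav`; swap in the card and device numbers your own listing shows.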

The Architecture at a Glance 🗺️
A sensible DIY AI Assistant with Raspberry Pi flow looks like this:
Wake word → live audio capture → ASR transcription → intent handling or LLM → response text → TTS → audio playback → optional actions via MQTT or HTTP.
- Wake word: Porcupine is small, accurate, and runs locally with per-keyword sensitivity control. [2]
- ASR: Whisper is a multilingual, general-purpose ASR model trained on ~680k hours of audio; it is robust to accents and background noise. For on-device use, `whisper.cpp` provides a lean C/C++ inference path. [3][4]
- Brain: Your choice - a cloud LLM via API, a rules engine, or local inference, depending on horsepower.
- TTS: Piper generates natural-sounding speech locally, fast enough for snappy replies on modest hardware. [5]
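One way to keep the stages above swappable is to treat each as a plain function and let a small object run them in order. The following is a minimal sketch of that idea, not a prescribed design; the `Pipeline` class and its field names are hypothetical, and the wake word is assumed to have fired already.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Pipeline:
    """Each field is one stage of the flow; any stage can be replaced independently."""
    transcribe: Callable[[bytes], str]            # ASR: audio -> text
    think: Callable[[str], str]                   # rules or LLM: text -> reply
    speak: Callable[[str], bytes]                 # TTS: reply -> audio
    act: Optional[Callable[[str], None]] = None   # optional MQTT/HTTP side effect

    def handle(self, audio: bytes) -> bytes:
        """Run one conversational turn and return the audio to play back."""
        text = self.transcribe(audio)
        reply = self.think(text)
        if self.act is not None:
            self.act(text)
        return self.speak(reply)
```

Because every stage is just a callable, you can start with stubs, then drop in whisper.cpp, an LLM client, or Piper one at a time without touching the others.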
Quick Comparison Table 🔎
| Tool | Best For | Price-ish | Why It Works |
|---|---|---|---|
| Porcupine Wake Word | Always-listening trigger | Free tier + | Low CPU, accurate, easy bindings [2] |
| Whisper.cpp | Local ASR on Pi | Open source | Good accuracy, CPU-friendly [4] |
| Faster-Whisper | Faster ASR on CPU/GPU | Open source | CTranslate2 optimizations |
| Piper TTS | Local speech output | Open source | Fast voices, many languages [5] |
| Cloud LLM API | Rich reasoning | Usage-based | Offloads heavy compute |
| Node-RED | Orchestrating actions | Open source | Visual flows, MQTT-friendly |
Step-by-Step Build: Your First Voice Loop 🧩
We'll use Porcupine for the wake word, Whisper for transcription, a simple "brain" function for replies (swap in your LLM of choice), and Piper for speech. Keep it minimal, then iterate.
1) Install dependencies

    sudo apt install -y python3-pip portaudio19-dev sox ffmpeg
    # If pip reports an externally managed environment, install inside a virtual environment instead.
    pip3 install sounddevice numpy

- Porcupine: grab the SDK/bindings for your language and follow the quick start (access key + keyword list + audio frames → `.process`). [2]
- Whisper (CPU-friendly): build whisper.cpp:

    git clone https://github.com/ggml-org/whisper.cpp
    cd whisper.cpp && cmake -B build && cmake --build build -j
    ./models/download-ggml-model.sh base.en
    ./build/bin/whisper-cli -m ./models/ggml-base.en.bin -f your.wav -otxt

The above mirrors the project's quick start. [4]
Prefer Python? `faster-whisper` (CTranslate2) is often snappier than the vanilla Python Whisper package on modest CPUs.
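For the Python route, a call into faster-whisper could look roughly like the sketch below. It assumes the `faster-whisper` package is installed (`pip3 install faster-whisper`); the `base.en` model size and `int8` compute type are illustrative picks for a small CPU, and the helper names are hypothetical.

```python
from typing import Iterable


def join_segments(segments: Iterable) -> str:
    """Concatenate the .text of each transcription segment into one string."""
    return " ".join(seg.text.strip() for seg in segments).strip()


def transcribe_file(path: str, model_size: str = "base.en") -> str:
    """Transcribe an audio file on the CPU with faster-whisper."""
    # Imported lazily so join_segments() is usable without the package installed.
    from faster_whisper import WhisperModel

    # compute_type="int8" is an assumption; use whatever your build supports.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(path)  # segments is a lazy generator
    return join_segments(segments)
```

Loading the model is the slow part, so in a long-running assistant you would create the `WhisperModel` once at startup and reuse it for every utterance.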
2) Set up Piper TTS

    git clone https://github.com/rhasspy/piper
    cd piper
    make
    # Download a voice model you like, e.g., en_US-amy
    echo "Hello." | ./piper --model voices/en/en_US-amy-medium.onnx --output_file hello.wav
    aplay hello.wav

Piper is designed for on-device TTS with multiple voice and language options. [5]
3) A minimal assistant loop in Python
Deliberately compact: it waits for a wake phrase (stub), records, transcribes with whisper.cpp, generates a reply (placeholder), then speaks via Piper. Swap the placeholder for your favorite LLM or rule logic.
    import os
    import subprocess
    import wave

    import sounddevice as sd

    WAKE_WORD = "hey computer"  # swap for Porcupine in production [2]
    RECORD_SECONDS = 6
    SAMPLE_RATE = 16000
    CHANNELS = 1
    WORKDIR = "/home/pi/assistant"
    ASR_BIN = "/home/pi/whisper.cpp/build/bin/whisper-cli"  # [4]
    ASR_MODEL = "/home/pi/whisper.cpp/models/ggml-base.en.bin"
    PIPER_BIN = "/home/pi/piper/build/piper"  # [5]
    PIPER_VOICE = "/home/pi/piper/voices/en/en_US-amy-medium.onnx"

    os.makedirs(WORKDIR, exist_ok=True)


    def record_wav(path, seconds=RECORD_SECONDS):
        """Record mono 16-bit audio from the default input device into a WAV file."""
        audio = sd.rec(int(seconds * SAMPLE_RATE), samplerate=SAMPLE_RATE,
                       channels=CHANNELS, dtype="int16")
        sd.wait()  # block until the recording has finished
        with wave.open(path, "wb") as w:
            w.setnchannels(CHANNELS)
            w.setsampwidth(2)  # 2 bytes per sample = int16
            w.setframerate(SAMPLE_RATE)
            w.writeframes(audio.tobytes())


    def transcribe(path):
        """Run whisper-cli on the WAV file and return the transcript text."""
        out_base = os.path.splitext(path)[0]
        # -of sets the output path without extension, so -otxt writes out_base + ".txt".
        cmd = [ASR_BIN, "-m", ASR_MODEL, "-f", path, "-otxt", "-of", out_base]
        subprocess.run(cmd, check=True, cwd=WORKDIR)
        with open(out_base + ".txt", "r", encoding="utf-8") as f:
            return f.read().strip()


    def generate_reply(prompt):
        """Placeholder brain: replace with your LLM call or rule logic."""
        if "weather" in prompt.lower():
            return "I can't see the clouds, but it might be fine. Bring a jacket just in case."
        return "You said: " + prompt


    def speak(text):
        """Synthesize the reply with Piper (text on stdin) and play it with aplay."""
        reply_wav = f"{WORKDIR}/reply.wav"
        subprocess.run([PIPER_BIN, "--model", PIPER_VOICE, "--output_file", reply_wav],
                       input=text.encode("utf-8"), check=True)
        subprocess.run(["aplay", reply_wav], check=True)


    print("Assistant ready. Type the wake phrase to test.")
    while True:
        typed = input("> ").strip().lower()
        if typed == WAKE_WORD:
            wav_path = f"{WORKDIR}/input.wav"
            record_wav(wav_path)
            text = transcribe(wav_path)
            reply = generate_reply(text)
            print("User:", text)
            print("Assistant:", reply)
            speak(reply)
        else:
            print("Type the wake phrase to test the loop.")
For real wake-word detection, integrate Porcupine's streaming detector (low CPU, per-keyword sensitivity). [2]
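A streaming detector like this consumes audio in fixed-size frames and reports a keyword index (or -1) for each one. The sketch below shows that frame loop in a detector-agnostic way so it can run without a microphone or an access key; `frames_of` and `wait_for_wake_word` are hypothetical helper names, and the commented wiring at the bottom (including `mic_samples`) is an assumption about how you might connect it to the `pvporcupine` Python package.

```python
from typing import Callable, Iterable, Iterator, List, Sequence


def frames_of(samples: Iterable[int], frame_length: int) -> Iterator[List[int]]:
    """Group a stream of 16-bit samples into fixed-size frames; a short tail is dropped."""
    frame: List[int] = []
    for sample in samples:
        frame.append(sample)
        if len(frame) == frame_length:
            yield frame
            frame = []


def wait_for_wake_word(samples: Iterable[int], frame_length: int,
                       process: Callable[[Sequence[int]], int]) -> int:
    """Feed frames to `process` until it returns an index >= 0; return -1 if the stream ends."""
    for frame in frames_of(samples, frame_length):
        index = process(frame)
        if index >= 0:
            return index
    return -1


# With Porcupine installed, the wiring would look roughly like this:
#   import pvporcupine
#   porcupine = pvporcupine.create(access_key="YOUR_ACCESS_KEY",
#                                  keywords=["computer"], sensitivities=[0.5])
#   wait_for_wake_word(mic_samples, porcupine.frame_length, porcupine.process)
#   porcupine.delete()
```

Keeping the detector behind a plain `process` callable also makes the loop easy to test with a fake function before any hardware is attached.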
Audio Tuning That Actually Matters 🎚️
A few small fixes make your assistant feel 10x smarter:
- Mic distance: 30–60 cm is a sweet spot for many USB mics.
- Levels: avoid clipping on input and keep playback sane; fix routing before chasing code ghosts. On Raspberry Pi OS, you can manage the output device and levels via system tools or `raspi-config`. [1]
- Room acoustics: hard walls cause echoes; a soft mat under the mic helps.
- Wake word threshold: too sensitive → ghost triggers; too strict → you'll be shouting at plastic. Porcupine lets you tune sensitivity per keyword. [2]
- Thermals: long transcriptions on a Pi 5 benefit from an active cooler for sustained performance. [1]
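To put numbers on "avoid clipping", you can measure a test recording instead of guessing. This is a small sketch that assumes a 16-bit PCM WAV file (the format the recording step in this guide produces); the function name and the `clip_margin` parameter are illustrative.

```python
import array
import math
import sys
import wave


def level_report(path: str, clip_margin: int = 32) -> dict:
    """Return peak and RMS in dBFS for a 16-bit PCM WAV, plus a clipping flag."""
    with wave.open(path, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        samples = array.array("h", w.readframes(w.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    if not samples:
        return {"peak_dbfs": float("-inf"), "rms_dbfs": float("-inf"), "clipped": False}

    def to_db(value: float) -> float:
        return 20 * math.log10(value / 32768) if value > 0 else float("-inf")

    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return {
        "peak_dbfs": to_db(peak),
        "rms_dbfs": to_db(rms),
        # Samples at (or within clip_margin of) full scale suggest the input gain is too high.
        "clipped": peak >= 32767 - clip_margin,
    }
```

Record a normal-volume sentence, run `level_report("test.wav")`, and lower the capture gain if `clipped` comes back `True`.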
From Toy to Appliance: Services, Autostart, Health Checks 🧯
Humans forget to run scripts. Computers forget to be nice. Turn your loop into a managed service:
- Create a systemd unit:

    [Unit]
    Description=DIY Voice Assistant
    After=network.target sound.target

    [Service]
    User=pi
    WorkingDirectory=/home/pi/assistant
    ExecStart=/usr/bin/python3 /home/pi/assistant/assistant.py
    Restart=always
    RestartSec=3

    [Install]
    WantedBy=multi-user.target

- Enable it:

    sudo cp assistant.service /etc/systemd/system/
    sudo systemctl daemon-reload
    sudo systemctl enable --now assistant.service

- Tail the logs:

    journalctl -u assistant -f

Now it starts on boot, restarts on crash, and generally behaves like an appliance. A little less exciting, but a lot better.
Skill System: Make It Actually Useful at Home 🏠✨
Once voice in and voice out are solid, add actions:
- Intent router: simple keyword routes for common tasks.
- Smart home: publish events to MQTT or call Home Assistant's HTTP endpoints.
- Plugins: quick Python functions like `set_timer`, `what_is_the_time`, `play_radio`, `run_scene`.
Even with a cloud LLM in the loop, route obvious local commands first for speed and reliability.
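A router that tries local patterns first and only then falls back to a slower brain can be very small. The sketch below is one possible shape, assuming regex-based intents; the `IntentRouter` class and the example pattern are hypothetical, and the fallback stands in for whatever LLM call you use.

```python
import re
from typing import Callable, List, Tuple

Handler = Callable[[re.Match], str]


class IntentRouter:
    """Try local regex intents in order; fall back to a slower 'brain' otherwise."""

    def __init__(self, fallback: Callable[[str], str]):
        self._routes: List[Tuple[re.Pattern, Handler]] = []
        self._fallback = fallback

    def add(self, pattern: str, handler: Handler) -> None:
        """Register a case-insensitive pattern and the function that answers it."""
        self._routes.append((re.compile(pattern, re.IGNORECASE), handler))

    def route(self, text: str) -> str:
        """Return the first matching handler's reply, or the fallback's reply."""
        for pattern, handler in self._routes:
            match = pattern.search(text)
            if match:
                return handler(match)
        return self._fallback(text)
```

For example, `router.add(r"set (?:a )?timer for (\d+) minutes?", lambda m: f"Timer set for {m.group(1)} minutes.")` answers timer requests instantly, while anything unmatched goes to the fallback.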
Local-Only vs Cloud Assist: Trade-offs You'll Feel 🌓
Local-only
Pros: private, offline, predictable cost.
Cons: heavier models can be slow on small boards. Whisper's multilingual training helps with robustness if you keep it on-device or on a nearby server. [3]
Cloud assist
Pros: powerful reasoning, larger context windows.
Cons: data leaves the device, network dependence, variable costs.
A hybrid often wins: wake word + ASR local → API call for reasoning → TTS local. [2][3][5]
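In a hybrid setup it helps to decide up front what happens when the network is down. The following is a minimal sketch of that decision, assuming `cloud` and `local` are your own functions that each take the transcript and return a reply; the function name and the `local_only` switch are hypothetical.

```python
from typing import Callable


def hybrid_reply(text: str,
                 cloud: Callable[[str], str],
                 local: Callable[[str], str],
                 local_only: bool = False) -> str:
    """Prefer the cloud brain, but degrade to local logic instead of failing."""
    if local_only:
        return local(text)
    try:
        return cloud(text)
    except Exception:
        # Network errors, timeouts, quota problems: answer locally rather than go silent.
        return local(text)
```

Catching every exception is a deliberate simplification here; in a real build you would likely narrow it to the errors your API client actually raises and log them.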
Troubleshooting: Weird Gremlins & Quick Fixes 👾
- Wake word false triggers: lower the sensitivity or try a different mic. [2]
- ASR lag: use a smaller Whisper model or build `whisper.cpp` with release flags (`-j --config Release`). [4]
- Choppy TTS: pre-generate common phrases; confirm your audio device and sample rates.
- No mic detected: check `arecord -l` and the mixers.
- Thermal throttling: use the official Active Cooler on Pi 5 for sustained performance. [1]
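Pre-generating common phrases can be as simple as caching the synthesized WAV by its text. The sketch below assumes you supply a `synthesize(text, path)` function that writes a WAV (for instance, a wrapper around your Piper command); the cache layout and helper name are illustrative.

```python
import hashlib
import os
from typing import Callable


def cached_wav(text: str, cache_dir: str,
               synthesize: Callable[[str, str], None]) -> str:
    """Return a WAV path for `text`, calling `synthesize(text, path)` only on a cache miss."""
    os.makedirs(cache_dir, exist_ok=True)
    # Normalize case and surrounding whitespace so "Okay." and "okay. " share one file.
    key = hashlib.sha256(text.strip().lower().encode("utf-8")).hexdigest()[:16]
    path = os.path.join(cache_dir, f"{key}.wav")
    if not os.path.exists(path):
        synthesize(text, path)
    return path
```

Short acknowledgements such as "Okay." or "Timer set." then play back instantly, and only novel sentences pay the synthesis cost.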
Security & Privacy Notes You Should Actually Read 🔒
- Keep your Pi updated with APT.
- If you use any cloud API, log what you send and consider redacting private data locally first.
- Run services with least privilege; avoid `sudo` in ExecStart unless it is required.
- Provide a local-only mode for guests or quiet hours.
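Redacting and logging before anything leaves the device can start out very simple. The sketch below uses two deliberately crude regular expressions for email addresses and phone-like digit runs; the patterns, placeholder tokens, and function names are assumptions for illustration and will miss plenty of real-world cases.

```python
import logging
import re

log = logging.getLogger("assistant.cloud")

# Intentionally simple patterns; tune them for the data you actually see.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholder tokens."""
    text = EMAIL.sub("[email]", text)
    return PHONE.sub("[phone]", text)


def prepare_for_cloud(text: str) -> str:
    """Redact locally, log exactly what will be sent, and return the safe text."""
    safe = redact(text)
    log.info("sending to cloud: %s", safe)
    return safe
```

Passing every outgoing prompt through `prepare_for_cloud()` gives you one place to audit what the assistant shares.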
Build Variants: Mix and Match Like a Sandwich 🥪
- Ultra-local: Porcupine + whisper.cpp + Piper + simple rules. Private and sturdy. [2][4][5]
- Speedy cloud assist: Porcupine + (a smaller local Whisper or cloud ASR) + local TTS + cloud LLM.
- Home automation hub: add Node-RED or Home Assistant flows for routines, scenes, and sensors.
Example Skill: Lights On via MQTT 💡
    import paho.mqtt.publish as publish

    MQTT_HOST = "192.168.1.10"
    TOPIC = "home/livingroom/light/set"


    def set_light(state: str):
        """Publish ON or OFF to the light's command topic."""
        payload = "ON" if state.lower().startswith("on") else "OFF"
        # publish.single() connects, runs the network loop until the QoS 1
        # message has been delivered, then disconnects.
        publish.single(TOPIC, payload, qos=1, retain=False,
                       hostname=MQTT_HOST, port=1883, keepalive=60)


    # if "turn on the lights" in text: set_light("on")
Add a voice line like "turn on the living room lights," and you'll feel like a wizard.
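Turning that spoken line into an MQTT message is mostly string matching. Here is a small sketch that maps an utterance to a topic and payload; the `home/<room>/light/set` topic layout and the `light_command` name are assumptions for illustration, so adapt them to whatever your broker and devices expect.

```python
import re
from typing import Optional, Tuple

# Matches phrases such as "turn on the living room lights" or "turn off the kitchen light".
LIGHT_CMD = re.compile(r"\bturn (on|off) the (.+?) lights?\b", re.IGNORECASE)


def light_command(text: str) -> Optional[Tuple[str, str]]:
    """Map an utterance to (topic, payload), or None if it is not a light command."""
    match = LIGHT_CMD.search(text)
    if not match:
        return None
    state = match.group(1).upper()                       # "ON" or "OFF"
    room = match.group(2).lower().replace(" ", "")       # "living room" -> "livingroom"
    return f"home/{room}/light/set", state
```

The returned pair can be handed straight to your MQTT publish call, and a `None` result means the utterance should go to the next skill or to the LLM.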
Why This Stack Works in Practice 🧪
- Porcupine is efficient and accurate at wake-word detection on small boards, which makes always-listening feasible. [2]
- Whisper's large, multilingual training makes it robust across varied environments and accents. [3]
- `whisper.cpp` keeps that power usable on CPU-only devices like the Pi. [4]
- Piper keeps responses snappy without shipping audio to a cloud TTS. [5]
Too Long, Didn't Read
Build a modular, private DIY AI Assistant with Raspberry Pi by combining Porcupine for the wake word, Whisper (via whisper.cpp) for ASR, your choice of brain for replies, and Piper for local TTS. Wrap it as a systemd service, tune the audio, and wire in MQTT or HTTP actions. It's cheaper than you think, and oddly delightful to live with. [1][2][3][4][5]
References
- [1] Raspberry Pi Software & Cooling - Raspberry Pi Imager (download & use) and Pi 5 Active Cooler product information.
- [2] Porcupine Wake Word - SDK & quick start (keywords, sensitivity, local inference).
- [3] Whisper (ASR model) - multilingual, robust ASR trained on ~680k hours. Radford et al., "Robust Speech Recognition via Large-Scale Weak Supervision."
- [4] whisper.cpp - CPU-friendly Whisper inference with CLI and build steps. https://github.com/ggml-org/whisper.cpp
- [5] Piper TTS - fast, local neural TTS with multiple voices/languages. https://github.com/rhasspy/piper