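A comprehensive guide to the role of intent recognition in Python chatbot development. It covers the techniques, tools, and best practices for building conversational AI that works for a global audience.
Python Chatbot Development: Mastering Intent Recognition Systems for Global Applications
In the rapidly evolving field of artificial intelligence, conversational AI has emerged as a transformative technology. Chatbots equipped with sophisticated natural language understanding (NLU) are at the forefront of this shift. For developers who want to build effective, engaging conversational agents, mastering intent recognition is essential. This guide takes a close look at intent recognition systems in Python chatbot development, offering insights, practical examples, and best practices for a global audience.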
What Is Intent Recognition?
At its core, an intent recognition system aims to understand the underlying purpose or goal of a user's query. When users interact with a chatbot, they are usually trying to accomplish something: asking a question, making a request, looking for information, or expressing a sentiment. Intent recognition is the process of classifying the user's utterance into a predefined category that represents that specific goal.
For example, consider the following user queries:
- "I'd like to book a flight to Tokyo."
- "What's the weather in London tomorrow?"
- "Could you tell me about your return policy?"
- "I'm very frustrated with this service."
An effective intent recognition system would classify them as follows:
- Intent: book_flight
- Intent: get_weather
- Intent: inquire_return_policy
- Intent: express_frustration
Without accurate intent recognition, a chatbot struggles to give relevant responses. The result is a poor user experience and, ultimately, a chatbot that fails to achieve its purpose.
The Importance of Intent Recognition in Chatbot Architecture
Intent recognition is a fundamental component of most modern chatbot architectures. It typically sits at the start of the NLU pipeline and processes raw user input before any further analysis.
A typical chatbot architecture looks like this:
- User input: Raw text or speech from the user.
- Natural language understanding (NLU): This module processes the input.
  - Intent recognition: Identifies the user's goal.
  - Entity extraction: Identifies key pieces of information in the utterance (e.g., dates, locations, names).
- Dialogue management: Based on the recognized intent and the extracted entities, decides which action the chatbot should take next. This may include fetching information, asking a clarifying question, or executing a task.
- Natural language generation (NLG): Produces a natural language response for the user.
- Chatbot response: The generated response returned to the user.
The accuracy and robustness of the intent recognition module directly affect every later stage. If the intent is misclassified, the chatbot attempts the wrong action and ends up giving an irrelevant or unhelpful response.
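The flow described above can be sketched as a chain of small functions. This is only an illustrative skeleton: every name in it (`NLUResult`, `understand`, `decide_action`, `generate_response`, `handle_message`), the keyword rule, and the canned responses are hypothetical placeholders and do not come from any library.

```python
from dataclasses import dataclass, field


@dataclass
class NLUResult:
    intent: str
    entities: dict = field(default_factory=dict)


def understand(text: str) -> NLUResult:
    """Placeholder NLU step: intent recognition plus entity extraction."""
    lowered = text.lower()
    if "weather" in lowered:
        # Toy entity extraction: look for a known city name in the text.
        cities = [c for c in ("london", "tokyo") if c in lowered]
        return NLUResult("get_weather", {"city": cities[0]} if cities else {})
    return NLUResult("unknown")


def decide_action(result: NLUResult) -> str:
    """Placeholder dialogue manager: pick the next action."""
    if result.intent == "get_weather":
        return "fetch_weather" if "city" in result.entities else "ask_city"
    return "fallback"


def generate_response(action: str, result: NLUResult) -> str:
    """Placeholder NLG step: turn the chosen action into text."""
    templates = {
        "fetch_weather": "Looking up the weather for {city}.",
        "ask_city": "Which city do you mean?",
        "fallback": "Sorry, I did not understand that.",
    }
    return templates[action].format(**result.entities)


def handle_message(text: str) -> str:
    result = understand(text)
    action = decide_action(result)
    return generate_response(action, result)


print(handle_message("What's the weather in London tomorrow?"))
print(handle_message("What's the weather like?"))
```

Because each stage only sees the output of the previous one, a wrong intent from `understand` cannot be corrected later in the chain, which is why the quality of this first step matters so much.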
Approaches to Intent Recognition
Building an intent recognition system involves choosing a suitable approach and using appropriate tools and libraries. The main methods fall into two broad categories: rule-based systems and machine learning-based systems.
1. Rule-Based Systems
Rule-based systems rely on predefined rules, patterns, and keywords to classify user intents. They are often implemented with regular expressions or pattern-matching algorithms.
Pros:
- Explainability: Rules are transparent and easy to understand.
- Control: Developers have precise control over how intents are recognized.
- Simple scenarios: Effective for highly constrained domains with predictable user queries.
Cons:
- Scalability: Hard to scale as the number of intents and the variation in user language grow.
- Maintenance: Maintaining a large set of complex rules is time-consuming and error-prone.
- Brittleness: Cannot handle variations in phrasing, synonyms, or grammatical structure that the rules do not explicitly cover.
Example using Python (conceptual):
def recognize_intent_rule_based(text):
    """Classify an utterance with hand-written keyword rules."""
    text = text.lower()  # Normalize case so matching is case-insensitive
    if "book" in text and ("flight" in text or "ticket" in text):
        return "book_flight"
    elif "weather" in text or "forecast" in text:
        return "get_weather"
    elif "return policy" in text or "refund" in text:
        return "inquire_return_policy"
    else:
        # No rule matched
        return "unknown"

print(recognize_intent_rule_based("I want to book a flight."))   # book_flight
print(recognize_intent_rule_based("What's the weather today?"))  # get_weather
This approach is simple, but it quickly becomes inadequate for real-world applications with diverse user input.
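Since rule-based systems are often implemented with regular expressions, here is a minimal sketch of that variant. The patterns and the name `recognize_intent_regex` are assumptions chosen for illustration. Word boundaries (`\b`) avoid one weakness of plain substring checks, which also match inside longer words, but the last example shows that unseen phrasings still fall through.

```python
import re

# Ordered list of (intent, compiled pattern); the first match wins.
INTENT_PATTERNS = [
    ("book_flight", re.compile(r"\bbook\b.*\b(flight|ticket)s?\b")),
    ("get_weather", re.compile(r"\b(weather|forecast)\b")),
    ("inquire_return_policy", re.compile(r"\b(return policy|refund)\b")),
]


def recognize_intent_regex(text: str) -> str:
    lowered = text.lower()
    for intent, pattern in INTENT_PATTERNS:
        if pattern.search(lowered):
            return intent
    return "unknown"


# Matches: "book" and "ticket" both appear as whole words.
print(recognize_intent_regex("Please book me a ticket to Tokyo"))
# No match: "book" only occurs inside "facebook".
print(recognize_intent_regex("I saw it on Facebook, is there a flight?"))
# No match: the same goal phrased with words the rules do not cover.
print(recognize_intent_regex("Reserve a seat on the next plane"))
```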
2. Machine Learning-Based Systems
Machine learning (ML) approaches use algorithms to learn patterns from data. For intent recognition, this usually means training a classification model on a dataset of user utterances labeled with their corresponding intents.
Pros:
- Robustness: Can handle language variation, synonyms, and different grammatical structures.
- Scalability: Adapts better to a growing number of intents and to more complex language.
- Continuous improvement: Performance can be improved by retraining on more data.
Cons:
- Data dependency: Requires a substantial amount of labeled training data.
- Complexity: Can be more complex to implement and understand than a rule-based system.
- Black-box nature: Some ML models are hard to interpret.
The most common ML approach to intent recognition is supervised classification. Given an input utterance, the model predicts the most likely intent from a predefined set of classes.
Common ML Algorithms for Intent Recognition
- Support vector machines (SVM): Effective for text classification because they find an optimal hyperplane that separates the intent classes.
- Naive Bayes: A simple probabilistic classifier that often performs well on text classification tasks.
- Logistic regression: A linear model that predicts the probability that an utterance belongs to a particular intent.
- Deep learning models (e.g., recurrent neural networks - RNN, convolutional neural networks - CNN, Transformers): These models can capture complex semantic relationships and are the state of the art for many NLU tasks.
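To make supervised classification concrete, the following is a minimal from-scratch sketch of a multinomial Naive Bayes intent classifier with add-one smoothing. The function names and the tiny training set are made up for illustration; in practice a library implementation would normally be used instead.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(utterances, intents):
    """Collect per-intent word counts for a multinomial Naive Bayes model."""
    word_counts = defaultdict(Counter)
    intent_counts = Counter(intents)
    vocabulary = set()
    for text, intent in zip(utterances, intents):
        tokens = text.lower().split()
        word_counts[intent].update(tokens)
        vocabulary.update(tokens)
    return word_counts, intent_counts, vocabulary


def predict_naive_bayes(text, model):
    word_counts, intent_counts, vocabulary = model
    total = sum(intent_counts.values())
    scores = {}
    for intent, count in intent_counts.items():
        # Log prior plus log likelihood with add-one (Laplace) smoothing.
        score = math.log(count / total)
        denominator = sum(word_counts[intent].values()) + len(vocabulary)
        for token in text.lower().split():
            if token in vocabulary:  # ignore words never seen in training
                score += math.log((word_counts[intent][token] + 1) / denominator)
        scores[intent] = score
    return max(scores, key=scores.get)


train_x = ["hello there", "hi", "good morning",
           "order a pizza", "i want a pizza",
           "track my order", "where is my order"]
train_y = ["greet", "greet", "greet",
           "order_pizza", "order_pizza",
           "check_order_status", "check_order_status"]

model = train_naive_bayes(train_x, train_y)
print(predict_naive_bayes("hello", model))
print(predict_naive_bayes("where is my pizza order", model))
```

The second query mixes words from two intents; the classifier picks the intent whose training examples explain the whole utterance best, rather than reacting to a single keyword.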
Python Libraries and Frameworks for Intent Recognition
Python's rich library ecosystem makes it an excellent choice for building sophisticated chatbot intent recognition systems. Some of the most prominent options are described below.
1. NLTK (Natural Language Toolkit)
NLTK is a foundational library for NLP in Python. It provides tools for tokenization, stemming, lemmatization, part-of-speech tagging, and more. It has no built-in end-to-end intent recognition system, but it is very useful for preprocessing text before feeding it to an ML model.
Primary use: Text cleaning and preparing tokens for feature extraction (e.g., TF-IDF).
2. spaCy
spaCy is a highly efficient, production-ready library for advanced NLP. It ships pretrained models for many languages and is known for its speed and accuracy. spaCy provides excellent tools for tokenization, named entity recognition (NER), and dependency parsing, all of which can be used to build intent recognition components.
Primary use: Text preprocessing, entity extraction, and building custom text classification pipelines.
3. scikit-learn
Scikit-learn is the de facto standard for traditional machine learning in Python. It provides a wide range of algorithms (SVM, Naive Bayes, logistic regression) along with tools for feature extraction (e.g., TfidfVectorizer), model training, evaluation, and hyperparameter tuning. It is a dependable library for building ML-based intent classifiers.
Primary use: Implementing SVM, Naive Bayes, and logistic regression for intent classification, and text vectorization.
4. TensorFlow and PyTorch
For deep learning approaches, TensorFlow and PyTorch are the leading frameworks. They make it possible to implement complex neural network architectures such as LSTMs, GRUs, and Transformers, which are very effective at understanding nuanced language and complex intent structures.
Primary use: Building deep learning models (RNN, CNN, Transformer) for intent recognition.
5. Rasa
Rasa is an open-source framework designed specifically for building conversational AI. It provides a comprehensive toolkit that includes NLU for both intent recognition and entity extraction, as well as dialogue management. Rasa's NLU component is highly configurable and supports a variety of ML pipelines.
Primary use: End-to-end chatbot development, NLU (intents and entities), dialogue management, and deployment.
Building a Python Intent Recognition System: A Step-by-Step Guide
For simplicity, we focus on an ML-based approach using scikit-learn and walk through the process of building a basic intent recognition system in Python.
Step 1: Define Intents and Collect Training Data
The first important step is to identify every distinct intent your chatbot needs to handle and to collect example utterances for each one. For a global chatbot, consider a wide range of expressions and language styles.
Example intents and data:
- Intent: greet
  - "Hello"
  - "Hi there"
  - "Good morning"
  - "Hey!"
  - "Greetings"
- Intent: bye
  - "Goodbye"
  - "See you later"
  - "Bye bye"
  - "Until next time"
- Intent: order_pizza
  - "I want to order a pizza."
  - "Can I get a large pepperoni pizza?"
  - "Please order a vegetarian pizza."
  - "I'd like to place a pizza order."
- Intent: check_order_status
  - "Where is my order?"
  - "What is the status of my pizza?"
  - "Track my order."
  - "When will my delivery arrive?"
Tip on global data: If you target a global audience, try to collect training data that reflects the dialects, colloquialisms, and sentence structures common in the regions your chatbot serves. For example, a user in the UK might say "I fancy a pizza", whereas in the US "I want to order a pizza" is more common. A classifier trained on only one of these styles is likely to miss the other.
Step 2: Preprocess the Text
Raw text must be cleaned and converted into a form suitable for machine learning models. This typically involves:
- Lowercasing: Convert all text to lowercase for consistency.
- Tokenization: Split sentences into individual words or tokens.
- Removing punctuation and special characters: Drop characters that add no semantic meaning.
- Removing stop words: Drop common words that have little impact on meaning (such as "a", "the", "is").
- Lemmatization/stemming: Reduce words to their base or root form (e.g., "running", "ran" → "run"). Lemmatization is generally preferred because it produces actual words.
Example using NLTK and spaCy:
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
import spacy

# Download the necessary NLTK data (run once).
# Recent NLTK releases may also require 'punkt_tab' for word_tokenize.
# nltk.download('punkt')
# nltk.download('stopwords')
# nltk.download('wordnet')

# Load the spaCy model for English (or for other languages if needed).
# Install it once with: python -m spacy download en_core_web_sm
snlp = spacy.load("en_core_web_sm")

lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words('english'))

def preprocess_text(text):
    """NLTK pipeline: lowercase, strip punctuation, drop stop words, lemmatize."""
    text = text.lower()
    text = re.sub(r'[^\w\s]', '', text)  # Remove punctuation
    tokens = nltk.word_tokenize(text)
    tokens = [word for word in tokens if word not in stop_words]
    lemmas = [lemmatizer.lemmatize(token) for token in tokens]
    return " ".join(lemmas)

# spaCy gives more robust tokenization, and its POS tagging helps lemmatization.
def preprocess_text_spacy(text):
    """spaCy pipeline: lowercase, then keep lemmas of non-stop, non-punctuation tokens."""
    text = text.lower()
    doc = snlp(text)
    tokens = [token.lemma_ for token in doc
              if not token.is_punct and not token.is_stop and not token.is_space]
    return " ".join(tokens)

print(f"NLTK preprocess: {preprocess_text('I want to order a pizza!')}")
print(f"spaCy preprocess: {preprocess_text_spacy('I want to order a pizza!')}")
Step 3: Feature Extraction (Vectorization)
Machine learning models need numerical input, so the text must be converted into numeric vectors. Common techniques include:
- Bag-of-Words (BoW): Represents text as a vector in which each dimension corresponds to a word in the vocabulary and the value is that word's frequency.
- TF-IDF (Term Frequency-Inverse Document Frequency): A more refined approach that weights a word by how often it appears in a document, discounted by how common it is across the whole corpus.
- Word embeddings (e.g., Word2Vec, GloVe, FastText): Dense vector representations that capture semantic relationships between words. They are often used with deep learning models.
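To show what the first two techniques compute, here is a small hand-rolled sketch on three toy documents. It uses a smoothed IDF, ln((1 + N) / (1 + df)) + 1, and skips vector normalization; both the toy corpus and these simplifications are assumptions made for illustration.

```python
import math
from collections import Counter

docs = ["order pizza", "track order", "hello"]
tokenized = [doc.split() for doc in docs]
vocabulary = sorted({token for tokens in tokenized for token in tokens})

# Bag-of-Words: one count per vocabulary word for each document.
bow = [[Counter(tokens)[word] for word in vocabulary] for tokens in tokenized]

# Document frequency: in how many documents each word appears.
df = {word: sum(word in tokens for tokens in tokenized) for word in vocabulary}
n_docs = len(docs)


def idf(word):
    # Smoothed inverse document frequency: rarer words get a larger weight.
    return math.log((1 + n_docs) / (1 + df[word])) + 1


# TF-IDF: raw term count multiplied by the word's IDF weight.
tfidf = [[count * idf(word) for count, word in zip(row, vocabulary)] for row in bow]

print(vocabulary)
print(bow)
print([[round(value, 2) for value in row] for row in tfidf])
```

In the first document, "pizza" ends up with a higher weight than "order" because "order" also appears in another document and is therefore less distinctive.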
Example using scikit-learn's TfidfVectorizer:
from sklearn.feature_extraction.text import TfidfVectorizer

# Sample data (preprocessed below with the spaCy helper from Step 2)
utterances = [
    "hello", "hi there", "good morning", "hey", "greetings",
    "goodbye", "see you later", "bye bye", "until next time",
    "i want to order a pizza", "can i get a large pepperoni pizza", "order a vegetarian pizza please",
    "where is my order", "what is the status of my pizza", "track my order"
]
# One label per utterance, in the same order
intents = [
    "greet", "greet", "greet", "greet", "greet",
    "bye", "bye", "bye", "bye",
    "order_pizza", "order_pizza", "order_pizza",
    "check_order_status", "check_order_status", "check_order_status"
]

preprocessed_utterances = [preprocess_text_spacy(u) for u in utterances]

# Learn the vocabulary and IDF weights, then build the feature matrix
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(preprocessed_utterances)

print(f"Feature matrix shape: {X.shape}")
print(f"Vocabulary size: {len(vectorizer.get_feature_names_out())}")
print(f"Example vector for 'order pizza': {X[utterances.index('i want to order a pizza')]}")
Step 4: Train the Model
Once the data is preprocessed and vectorized, we train a classification model. This example uses scikit-learn's LogisticRegression.
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, intents, test_size=0.2, random_state=42)
# Initialize and train the model
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
# Evaluate the model
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy: {accuracy:.2f}")
print("Classification Report:")
print(classification_report(y_test, y_pred, zero_division=0))
Step 5: Prediction and Integration
After training, the model can predict the intent of new, unseen user utterances.
def predict_intent(user_input, vectorizer, model):
    """Preprocess the input, vectorize it, and return the predicted intent."""
    preprocessed_input = preprocess_text_spacy(user_input)
    input_vector = vectorizer.transform([preprocessed_input])
    predicted_intent = model.predict(input_vector)[0]
    return predicted_intent

# Example predictions. The utterances are kept in a list because backslash
# escapes inside f-string expressions are a syntax error before Python 3.12.
examples = [
    "Hi there, how are you?",
    "I'd like to track my pizza order.",
    "What's the news?",
]
for utterance in examples:
    print(f"User says: {utterance!r} -> Intent: {predict_intent(utterance, vectorizer, model)}")
This basic ML pipeline can be integrated into a chatbot framework. For more complex applications, you would integrate entity extraction alongside intent recognition.
Advanced Topics and Considerations
1. Entity Extraction
As mentioned earlier, intent recognition is often paired with entity extraction. Entities are specific pieces of information in a user's utterance that are relevant to the intent. For example, in "Can I get a large pepperoni pizza?", "large" is a size entity and "pepperoni" is a topping entity.
Libraries such as spaCy (with its NER capabilities) and NLTK, and frameworks such as Rasa, offer robust entity extraction features.
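For a small, closed set of values such as pizza sizes and toppings, a dictionary lookup is often enough to get started. The sketch below is such a lookup, not a statistical NER model; the `ENTITY_VALUES` table and the `extract_entities` helper are hypothetical names invented for this example.

```python
import re

# Hypothetical entity vocabularies for a pizza-ordering bot.
ENTITY_VALUES = {
    "size": ["small", "medium", "large"],
    "topping": ["pepperoni", "mushroom", "vegetarian", "margherita"],
}


def extract_entities(text: str) -> dict:
    """Return the first known value found for each entity type."""
    lowered = text.lower()
    entities = {}
    for entity_type, values in ENTITY_VALUES.items():
        # Match any listed value as a whole word.
        pattern = r"\b(" + "|".join(map(re.escape, values)) + r")\b"
        match = re.search(pattern, lowered)
        if match:
            entities[entity_type] = match.group(1)
    return entities


print(extract_entities("Can I get a large pepperoni pizza?"))
print(extract_entities("I want to order a pizza"))
```

When the second call returns no entities, a dialogue manager could use that gap to ask a follow-up question about size and topping.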
2. Handling Ambiguity and Out-of-Scope Queries
Not every user input maps cleanly to a defined intent. Some inputs are ambiguous, and others fall outside the chatbot's scope.
- Ambiguity: When the model is uncertain between two or more intents, the chatbot can ask a clarifying question.
- Out-of-scope (OOS) detection: It is important to implement a mechanism that detects when a query matches none of the known intents. This usually involves setting a confidence threshold on predictions or training a dedicated "out_of_scope" intent.
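Both ideas can be expressed as a small decision rule over the classifier's class probabilities. The sketch below assumes the probabilities are already available as a dictionary (for instance, built from a model's predicted probabilities). The function name `resolve_intent` and the threshold values 0.6 and 0.15 are arbitrary assumptions that would need tuning on real data.

```python
def resolve_intent(probabilities, min_confidence=0.6, min_margin=0.15):
    """Map class probabilities to an intent, a clarification request, or OOS.

    `probabilities` is a dict such as {"order_pizza": 0.7, "greet": 0.3}.
    """
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    best_intent, best_score = ranked[0]
    runner_up, runner_up_score = ranked[1] if len(ranked) > 1 else (None, 0.0)

    # Ambiguous: the top two are close and together hold most of the mass.
    close_call = best_score - runner_up_score < min_margin
    if runner_up is not None and close_call and best_score + runner_up_score >= min_confidence:
        return ("clarify", (best_intent, runner_up))

    # Confident: a single intent clearly wins.
    if best_score >= min_confidence:
        return ("intent", best_intent)

    # Otherwise no known intent fits well enough.
    return ("out_of_scope", None)


print(resolve_intent({"order_pizza": 0.85, "check_order_status": 0.10, "greet": 0.05}))
print(resolve_intent({"order_pizza": 0.46, "check_order_status": 0.44, "greet": 0.10}))
print(resolve_intent({"order_pizza": 0.30, "check_order_status": 0.25, "greet": 0.25, "bye": 0.20}))
```

A "clarify" result would typically trigger a question such as "Do you want to place an order or check an existing one?", while "out_of_scope" would trigger a fallback response.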
3. Multilingual Intent Recognition
For a global audience, supporting multiple languages is essential. This can be achieved through several strategies:
- Language detection + separate models: Detect the user's language and route the input to a language-specific NLU model. This requires training a separate model for each language.
- Cross-lingual embeddings: Use word embeddings that map words from different languages into a shared vector space, so that a single model can handle multiple languages.
- Machine translation: Translate the user input into a common language (e.g., English) before processing, then translate the chatbot's response back. This can introduce translation errors.
Frameworks such as Rasa support building NLU pipelines for multiple languages.
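The first strategy, language detection followed by routing, can be sketched as follows. Everything here is a toy stand-in: the marker-word detector, the keyword classifiers, and the names `detect_language`, `make_keyword_classifier`, and `route` are hypothetical. A real system would use a proper language identification library and trained per-language models.

```python
# Toy language detector based on a few marker words per language.
LANGUAGE_MARKERS = {
    "en": {"the", "a", "i", "want", "my", "is"},
    "es": {"el", "la", "una", "quiero", "mi", "es"},
    "de": {"der", "die", "eine", "ich", "mein", "ist"},
}


def detect_language(text: str, default: str = "en") -> str:
    tokens = set(text.lower().split())
    scores = {lang: len(tokens & markers) for lang, markers in LANGUAGE_MARKERS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default


def make_keyword_classifier(keyword_to_intent):
    """Build a stand-in for a trained, language-specific intent model."""
    def classify(text):
        lowered = text.lower()
        for keyword, intent in keyword_to_intent.items():
            if keyword in lowered:
                return intent
        return "unknown"
    return classify


# One classifier per supported language; German is deliberately missing.
CLASSIFIERS = {
    "en": make_keyword_classifier({"pizza": "order_pizza", "hello": "greet"}),
    "es": make_keyword_classifier({"pizza": "order_pizza", "hola": "greet"}),
}


def route(text: str):
    language = detect_language(text)
    classifier = CLASSIFIERS.get(language)
    if classifier is None:
        return (language, "unsupported_language")
    return (language, classifier(text))


print(route("I want a pizza"))
print(route("Quiero una pizza"))
print(route("Ich möchte eine Pizza"))
```

Returning an explicit "unsupported_language" result lets the bot respond gracefully instead of forcing the input through a model trained on another language.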
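4. Context and State Management
A truly conversational chatbot needs to remember the context of the conversation. This means the intent recognition system may need to take earlier turns of the dialogue into account to interpret the current utterance correctly. For example, the utterance "Yes, that one" can only be understood by knowing from the earlier context what "that one" refers to.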
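One simple way to carry context is to store what the bot asked last and consult it when a short reply arrives. The sketch below assumes a hypothetical `DialogueState` with a single `pending_confirmation` slot and an `interpret` helper; real dialogue managers track far more state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DialogueState:
    # What the bot last asked the user to confirm, if anything.
    pending_confirmation: Optional[str] = None


AFFIRMATIONS = {"yes", "yeah", "yep", "sure"}


def interpret(text: str, state: DialogueState) -> str:
    """Resolve short replies such as "Yes, that one" using the previous turn."""
    words = text.lower().replace(",", " ").split()
    first_word = words[0] if words else ""
    if first_word in AFFIRMATIONS:
        if state.pending_confirmation:
            intent = f"confirm_{state.pending_confirmation}"
            state.pending_confirmation = None  # the question has been answered
            return intent
        return "affirm_without_context"
    return "unknown"


# The bot has just asked: "Do you want the large pepperoni pizza?"
state = DialogueState(pending_confirmation="order_pizza")
print(interpret("Yes, that one.", state))
# The same words with no open question cannot be resolved.
print(interpret("Yes, that one.", state))
```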
5. Continuous Improvement and Monitoring
The performance of an intent recognition system degrades over time as users' language evolves and new patterns emerge. The following practices are essential:
- Monitor logs: Review conversations regularly to identify misunderstood queries and misclassified intents.
- Collect user feedback: Let users report when the chatbot has misunderstood them.
- Retrain the model: Retrain the model periodically with new data from logs and feedback to improve accuracy.
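As a rough illustration of the monitoring loop, the sketch below filters a conversation log for entries worth reviewing and computes how often each predicted intent was flagged by users. The log format, the field names, and the 0.5 confidence threshold are all assumptions made for this example, and the log entries are invented.

```python
from collections import Counter

# Hypothetical conversation log: predicted intent, model confidence, and
# whether the user flagged the reply as a misunderstanding.
log = [
    {"text": "track my pizza", "intent": "check_order_status", "confidence": 0.91, "flagged": False},
    {"text": "cancel it", "intent": "order_pizza", "confidence": 0.41, "flagged": True},
    {"text": "any vegan options", "intent": "order_pizza", "confidence": 0.48, "flagged": False},
    {"text": "hello", "intent": "greet", "confidence": 0.97, "flagged": False},
]


def select_for_review(entries, confidence_threshold=0.5):
    """Pick entries worth relabeling: flagged by users or low confidence."""
    return [e for e in entries if e["flagged"] or e["confidence"] < confidence_threshold]


def flagged_rate_by_intent(entries):
    """Share of flagged conversations per predicted intent."""
    totals = Counter(e["intent"] for e in entries)
    flagged = Counter(e["intent"] for e in entries if e["flagged"])
    return {intent: flagged[intent] / total for intent, total in totals.items()}


print([e["text"] for e in select_for_review(log)])
print(flagged_rate_by_intent(log))
```

Entries selected this way can be relabeled by a person and added to the training set before the next retraining run.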
Global Best Practices for Intent Recognition
When building a chatbot for a global audience, the following intent recognition best practices are important:
- Inclusive data collection: Collect training data from the diverse demographics, regions, and linguistic backgrounds your chatbot serves. Avoid relying on data from a single region or language variant.
- Consider cultural nuances: How users express themselves is strongly shaped by culture. Levels of politeness, directness, and common idioms vary widely, so train the model to recognize these differences.
- Use multilingual tools: Invest in NLU libraries and frameworks that offer robust support for multiple languages. This is often more efficient than building a completely separate system for each language.
- Prioritize OOS detection: A global user base will inevitably produce queries outside the defined intents. Effective out-of-scope (OOS) detection prevents the chatbot from giving nonsensical or irrelevant responses, which can be especially frustrating for users who are unfamiliar with the technology.
- Test with diverse user groups: Before a global rollout, test extensively with beta users from different countries and cultures. Their feedback is invaluable for finding intent recognition problems you may have overlooked.
- Clear error handling: When an intent is misunderstood or an OOS query is detected, provide a clear, helpful, and culturally appropriate fallback response. Offer options to connect to a human agent or to rephrase the query.
- Regular audits: Audit intent categories and training data periodically to make sure they remain relevant to, and representative of, the evolving needs and language of your global user base.
Conclusion
Intent recognition is the cornerstone of effective conversational AI. Mastering it in Python chatbot development requires a solid understanding of NLU principles, careful data management, and the strategic use of powerful libraries and frameworks. By adopting robust machine learning approaches, focusing on data quality and diversity, and following global best practices, developers can build intelligent, adaptable, and user-friendly chatbots that understand and serve users around the world. As conversational AI continues to mature, the ability to decode user intent accurately will remain a key differentiator for successful chatbot applications.