Is there a library to see the weights learned by the RASA NLU interpreter after training?

I'm using RASA NLU for entity extraction from healthcare invoices. I have manually tagged OCR-extracted data from these invoices and trained on it with 10 relevant entities (clinic name, clinic address, ...). RASA implicitly learns several features from the data. Is there any way to see these features and the weights learned for them?
I have already tried ELI5, but it doesn't seem to support the Rasa interpreter.
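For context: as far as I understand, Rasa NLU's CRFEntityExtractor is built on sklearn-crfsuite, and a trained sklearn_crfsuite.CRF object does expose its learned weights directly. Here is a minimal standalone sketch of that kind of inspection (the training data is a toy example); what I'm missing is how to get at the equivalent CRF object inside a trained Rasa model:

    # Toy example: inspect the learned weights of a sklearn-crfsuite CRF,
    # the library underlying Rasa NLU's CRFEntityExtractor.
    import sklearn_crfsuite

    # One feature dict per token, one BIO label per token.
    X_train = [
        [{"word.lower": "acme", "is_title": True},
         {"word.lower": "clinic", "is_title": True}],
        [{"word.lower": "invoice", "is_title": False}],
    ]
    y_train = [["B-clinic_name", "I-clinic_name"], ["O"]]

    crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1,
                               max_iterations=50)
    crf.fit(X_train, y_train)

    # Learned per-feature weights: {(feature, label): weight}
    for (feature, label), weight in sorted(crf.state_features_.items(),
                                           key=lambda kv: -abs(kv[1]))[:10]:
        print(f"{weight:+.3f}  {label:15s}  {feature}")

    # Label-to-label transition weights:
    print(crf.transition_features_)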
Has anyone come across this? Thanks in advance.

Related

NLP (Natural Language Processing), Machine Learning, Deep Learning

I am developing an ASR system for my local Kashmiri language. I have already done some work: I collected some data and trained a CNN model on it, but the accuracy was not so good. Now I want to change strategy and work at the phoneme level. Can anybody suggest a way to do this?
Thanks in advance.

Analytics tools for Rasa

I have been doing some research on analytics solutions for a chatbot I am currently building using the Rasa framework. I found some solutions such as Chatbase and Dashbot. The docs don't provide this information, but does Rasa provide analytics tools? If not, what other tools would you recommend? I am looking for an analytics tool that will let me see which questions are asked the most. Thank you in advance!
There are currently no dedicated open source analytic tools for Rasa.
What I suggest is to either use a persistent tracker store, e.g. MongoDB, or connect an event broker, e.g. RabbitMQ, to Rasa Core and store the streamed events as you like.
You can then write queries on top of that, or visualize the data with tools like Prometheus and Grafana.
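If you go the MongoDB route, a rough sketch of such a query (the database and collection names and the event layout are assumptions about how the tracker store persists events; adjust to your setup) could look like this in Python:

    # Count the most frequent user messages stored by a Mongo tracker store.
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    conversations = client["rasa"]["conversations"]

    pipeline = [
        {"$unwind": "$events"},                  # one document per event
        {"$match": {"events.event": "user"}},    # keep user messages only
        {"$group": {"_id": "$events.text", "count": {"$sum": 1}}},
        {"$sort": {"count": -1}},
        {"$limit": 10},
    ]
    for doc in conversations.aggregate(pipeline):
        print(doc["count"], doc["_id"])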
As mentioned above, you can dump your logs to MongoDB and then do the analytics on your own. The challenge here is that you have to label the correct and wrong classifications yourself (i.e. TP, FP, etc.).
If you want to see frequent questions, what you can do is try to cluster all similar sentences together (first in a naive manner): calculate, for each sentence, the sum of the vectors of all its words, and then calculate the cosine distance between those sentence vectors. Here is a good start:
https://stackoverflow.com/a/35092200/1340604
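A minimal sketch of that idea (the tiny embedding table is a placeholder; in practice you would load pretrained vectors such as word2vec or GloVe):

    # Represent each sentence as the sum of its word vectors, then compare
    # sentences by cosine distance; cluster those below a chosen threshold.
    import numpy as np
    from scipy.spatial.distance import cosine

    embeddings = {  # placeholder vectors; use pretrained ones in practice
        "what": np.array([0.1, 0.3, 0.2]),
        "are": np.array([0.0, 0.1, 0.1]),
        "your": np.array([0.2, 0.0, 0.1]),
        "opening": np.array([0.5, 0.4, 0.0]),
        "hours": np.array([0.6, 0.5, 0.1]),
        "when": np.array([0.1, 0.4, 0.2]),
        "open": np.array([0.5, 0.5, 0.0]),
    }

    def sentence_vector(sentence):
        """Sum the vectors of all known words in the sentence."""
        vectors = [embeddings[w] for w in sentence.lower().split()
                   if w in embeddings]
        return np.sum(vectors, axis=0)

    a = sentence_vector("What are your opening hours")
    b = sentence_vector("When are you open")
    print(cosine(a, b))  # 0 = same direction; small values = similar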
By the way, AFAIK Rasa offers an analytics tool as part of their Rasa Core enterprise license.

How to classify a sentence based on a few templates?

I have a question about how chatbot platforms such as api.ai and wit.ai work. You only provide them with a few templates for each intent/class, and then their system is able to classify new user queries into one of those classes.
I am trying to do the same and have tested algorithms such as SVM, logistic regression, naive Bayes, and a few others. For the feature extraction part I have tried most of the features people usually use, such as n-grams, vector representations, bag-of-words, etc.
What would be the best algorithm(s) in such a case, when there are only a few templates per class?
Thanks
Amir :)
You can't really do any good classification based on just a few examples. The way established platforms may do this is by looking at similar intents, examples, etc. already provided by other clients or the creators of the system, and then trying to extrapolate.
So the point is that they have some prior data already and they base the classification outcome on that prior.
What you could do, though, is mine any open sources you can find for examples similar to yours, and then try to train your classifier on that. In any case, it is worth having a reasonable set of training data first.
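As a rough baseline once you have mined some data (the intents and templates below are made up for illustration), a simple scikit-learn pipeline goes a long way:

    # Few-templates-per-class baseline: TF-IDF features + logistic regression.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    templates = [
        ("book me a flight to paris", "book_flight"),
        ("i need a plane ticket", "book_flight"),
        ("what's the weather like today", "get_weather"),
        ("will it rain tomorrow", "get_weather"),
    ]
    texts, labels = zip(*templates)

    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                        LogisticRegression())
    clf.fit(texts, labels)

    print(clf.predict(["is it going to be sunny"]))  # expect: get_weather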

What do I need to know on NLP to be able to use and train Stanford NLP for intent analysis? [closed]

Any book, tutorial, or course recommendations would be much appreciated.
I need to know what level of NLP knowledge I need to comprehend Stanford NLP and train and customize it for my commercial sentiment analysis app.
My goal is not a career in NLP or to become an NLP expert, but only to be proficient enough to understand and use the open-source NLP frameworks properly and train them for my application.
For this level, what NLP study/training would be needed?
I'm learning C# and .NET as well.
First: to simply use a sentiment model or train on existing data, there is not too much background to learn:
Tokenization
Constituency parsing, parse trees, etc.
Basic machine learning concepts (classification, cost functions, training / development sets, etc.)
These are well-documented ideas and are all a Google away. It might be worth it to skim the Coursera Natural Language Processing course (produced by people here at Stanford!) for the above ideas as well.
After that, the significant task is understanding how the RNTN sentiment model inside CoreNLP works. You don't need to grasp the math fully, I suppose, but the basic recursive nature of the algorithm is important to understand. The best resource is of course the original paper (and there's not much else, to be honest).
To train your own sentiment model, you'll need your own sentiment data. Producing this data is no small task. The data for the Stanford sentiment model was crowdsourced, and you may need to do something similar if you want to collect anything near the same scale.
The RNTN sentiment paper (mentioned above) gives some details on the data format. I'm happy to expand on this further if you do wish to create your own data.
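If it helps, here is a rough sketch of querying the sentiment annotator through the CoreNLP HTTP server from Python (assuming a server is already running on port 9000; the same endpoint works from C# or any other language with an HTTP client):

    # Query a running CoreNLP server for per-sentence sentiment labels.
    # Start the server first, e.g.:
    #   java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000
    import json
    import requests

    def corenlp_sentiment(text, url="http://localhost:9000"):
        props = {"annotators": "sentiment", "outputFormat": "json"}
        resp = requests.post(url,
                             params={"properties": json.dumps(props)},
                             data=text.encode("utf-8"))
        resp.raise_for_status()
        return [(s["index"], s["sentiment"])
                for s in resp.json()["sentences"]]

    print(corenlp_sentiment("The fees were outrageous. The staff was lovely."))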
I think you simply need to understand the concepts of supervised and unsupervised learning. In addition, some Java knowledge might be useful.

Twitter sentiment - free existing tool? [closed]

I need a tool to analyze tweets sentiment (positive/neutral/negative). I would like it to be free for students.
So far I have found Sentiment140 (http://www.sentiment140.com/) and the Sanders Twitter Sentiment Corpus (http://www.sananalytics.com/lab/twitter-sentiment/). However, both provide only training data (which, I know, I can use with NLTK, http://nltk.org/).
But I would like to know if there is any Python script or Java program that is freely distributed and can be downloaded and run to analyze tweet sentiment. I need to analyze a huge data set, so using the Sentiment140 REST API is also unsatisfactory.
Any ideas?
PS: I'm interested in English sentiment analysis.
You can use TextBlob, a python package to analyze short texts for sentiment offline. There is a trained model included with the package, so you won't need to train one. It should be fine for some projects, but if you want to achieve greater accuracy, consider training your own model on your specific data.
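For example, a minimal sketch (the polarity thresholds used to map scores to positive/neutral/negative are an arbitrary choice you should tune):

    # TextBlob's bundled sentiment model returns a polarity score in [-1, 1].
    from textblob import TextBlob

    def classify_tweet(text):
        polarity = TextBlob(text).sentiment.polarity
        if polarity > 0.1:
            return "positive"
        if polarity < -0.1:
            return "negative"
        return "neutral"

    print(classify_tweet("I love this new phone!"))  # positive
    print(classify_tweet("Worst service ever."))     # negative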
