lucasushi/NER

UST-NER-Project

This project aims to build a named entity recognition (NER) system for Chinese entity names and personal names.

Reference links

  1. jieba: https://github.com/fxsjy/jieba

  2. colah's blog: http://colah.github.io/ (we can study some ML topics in this blog)

  3. machine learning book: http://machinelearningbook.com (800+ pages covering ML topics in detail)

  4. English and Hindi NER: https://github.com/monikkinom/ner-lstm

  5. Stanford assignment: https://github.com/Observerspy/CS224n/tree/master/assignment3

  6. tushare news: http://tushare.org (retrieving data)

Useful Info

To-do lists

  • Build a GUI for constructing the training and testing datasets (done)

  • Download news data and split it into sentences

  • Get familiar with Recurrent Neural Networks (hold a seminar)

  • Carefully study the existing NER code for English and Hindi (hold a seminar later)

  • Design a deep learning model for the Chinese NER system

  • Test the model on our own testing data as well as on external data

  • Design a simple prototype of the NER system

  • Further studies: improve the NER system, company logo detection project (using a CNN), web-crawling plugins, etc.
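For the "split news data into sentences" item above, a minimal sketch could look like the following. It uses only the Python standard library. The function name `split_sentences` and the set of sentence-ending punctuation marks are assumptions made for illustration, not part of this project's code, and real news text may need extra rules.

```python
import re

# Assumption: these marks end a sentence in Chinese news text. Closing quotes
# or brackets that directly follow the mark are kept with the sentence.
_SENTENCE_END = re.compile(r"([。！？!?；;]+[”’」』）)]*)")


def split_sentences(text):
    """Split text into sentences, keeping the ending punctuation."""
    sentences = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # re.split with a capturing group returns the text pieces at even
        # indices and the matched delimiters at odd indices.
        parts = _SENTENCE_END.split(line)
        for i in range(0, len(parts), 2):
            body = parts[i]
            end = parts[i + 1] if i + 1 < len(parts) else ""
            sentence = (body + end).strip()
            if sentence:
                sentences.append(sentence)
    return sentences


if __name__ == "__main__":
    for s in split_sentences("今天天气很好。我们去公园吧！你觉得呢？"):
        print(s)
```

Each returned sentence can then be passed to a word segmenter such as jieba or labeled directly at the character level, depending on how the dataset is constructed.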

Work flow

Part 1 -- data gathering and pre-processing. This step is just as important as the modeling. For example, how can we collect data systematically? One option is to write a web crawler that crawls forums.

Part 2 -- use an existing model to get a sense of the "goodness" of the training data

Part 3 -- implement the model in TensorFlow, or reuse an existing implementation if one is available

Part 4 -- tweak the model so that it works well for our problem and data
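One possible way to put a number on the "goodness" mentioned in Part 2 is to compare an existing model's predictions against the hand-labeled data using entity-level precision, recall, and F1. The sketch below assumes the labels use the BIO scheme (`B-PER`, `I-PER`, `O`, and so on); this project does not specify a tagging scheme, so both the scheme and the function names here are hypothetical.

```python
def extract_entities(tags):
    """Return a set of (start, end, type) spans from a BIO tag sequence.

    `end` is exclusive. An I- tag that does not continue an open span of the
    same type is ignored.
    """
    entities = set()
    start, etype = None, None
    # A trailing "O" acts as a sentinel so the last open span is flushed.
    for i, tag in enumerate(list(tags) + ["O"]):
        begins = tag.startswith("B-")
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            entities.add((start, i, etype))
            start, etype = None, None
        if begins:
            start, etype = i, tag[2:]
    return entities


def entity_prf(gold_tags, pred_tags):
    """Entity-level precision, recall and F1 for one tagged sequence."""
    gold = extract_entities(gold_tags)
    pred = extract_entities(pred_tags)
    true_positives = len(gold & pred)  # exact span and type matches only
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    gold = ["B-PER", "I-PER", "O", "B-ORG", "I-ORG"]
    pred = ["B-PER", "I-PER", "O", "B-ORG", "O"]
    print(entity_prf(gold, pred))
```

Only exact matches of both span boundaries and entity type count as correct here, which is a strict choice; a partial-match score would be more forgiving of boundary errors.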
