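An outstanding academic leader
Leading AI technology development and building the core engine
Shaodian Zhang (张少典)
Founder / CEO
Ph.D., Senior Engineer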
 

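Shanghai "Thousand Talents Program"; Shanghai Pujiang Talent Program
Forbes 30 Under 30 Asia 2017 (healthcare technology category)
Hurun China 30 Under 30 Entrepreneurial Leaders 2017
Distinguished Research Fellow, APEX Data and Knowledge Management Lab, Department of Computer Science, Shanghai Jiao Tong University
Member, Theory and Education Committee, China Medical Informatics Association (CMIA)
Member, Health Informatics Education Committee, Chinese Health Information and Healthcare Big Data Association
Advisor, Pediatrics Committee, Chinese Health Information and Healthcare Big Data Association
Expert Member, Informatization Committee, China National Health Association (中国民族卫生协会)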

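Since 2011, he has worked on research in medical informatics. During his Ph.D., he took part in several projects funded by the U.S. National Science Foundation (NSF), the U.S. National Cancer Institute (NCI), and other agencies, and contributed to clinical data modeling and natural language processing projects at NewYork-Presbyterian Hospital. He has published more than ten papers in leading international medical informatics journals and conferences, including JAMIA, JBI, and AMIA. He was a best student paper finalist at the American Medical Informatics Association (AMIA) Annual Symposium in both 2014 and 2016, and in 2016 he won the best student paper award in the CPHI subfield of the AMIA Symposium. His papers have been cited more than 600 times in total on Google Scholar (as of October 2018). Throughout his Ph.D., he served as a reviewer for leading journals such as JAMIA, JBI, and JMIR and for leading conferences such as AMIA and ACL. In 2010 and 2012, he was a full-time intern at Microsoft Research Asia and at Microsoft Research in Redmond, working on data mining and natural language processing.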

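Strong technical R&D capabilities
Supplying ammunition for AI and laying a solid foundation
300+
A company team of 300+ people
1
No. 1 in the industry by number of published Chinese medical NLP papers
30%
30% of staff hold a master's degree or Ph.D.
40+
More than 40 SCI-indexed papers published by the R&D team
100+
100+ staff with a medical background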

Selected papers:

 

Shaodian Zhang, Erin O'Carroll Bantum, Jason Owen, Suzanne Bakken, and Noemie Elhadad, Online cancer communities as informatics intervention for social support: Conceptualization, Characterization, and Impact, Journal of the American Medical Informatics Association (JAMIA).

 

Junjie Xing, Kenny Q. Zhu, and Shaodian Zhang, Adaptive Multi-Task Transfer Learning for Chinese Word Segmentation in Medical Text, COLING 2018.

 

Zhenghui Wang, Yanru Qu, Liheng Chen, Jian Shen, Weinan Zhang, Shaodian Zhang, Yimei Gao, Gen Gu, Ken Chen, and Yong Yu, Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition, NAACL 2018.

 

Zhe Jian, Xusheng Guo, Shijian Liu, Handong Ma, Shaodian Zhang, Rui Zhang, and Jianbo Lei, A Cascaded Approach for Chinese Clinical Text De-Identification with Less Annotation Effort, Journal of Biomedical Informatics (JBI).


Shaodian Zhang, Tian Kang, Lin Qiu, Weinan Zhang, Yong Yu, and Noemie Elhadad, Cataloguing treatments discussed and used in online autism communities, 2017 International World Wide Web Conference (WWW) (acceptance rate: 17%).

 

Shaodian Zhang, Lin Qiu, Frank Chen, Weinan Zhang, Yong Yu, and Noemie Elhadad, "We Make Choices We Think are Going to Save Us": Debate and Stance Identification for Online Breast Cancer CAM Discussions, 2017 International World Wide Web Conference (WWW).


Erin O'Carroll Bantum, Noemie Elhadad, Jason E. Owen, Shaodian Zhang, Mitch Golant, Joanne Buzaglo, Joanne Stephen, and Janine Giese-Davis, Machine Learning for Identifying Emotional Expression in Text: Improving the Accuracy of Established Methods, Journal of Technology in Behavioral Science.

 

Tian Kang, Shaodian Zhang, Youlan Tang, Gregory W. Hruby, Alexander Rusanov, Noemie Elhadad, and Chunhua Weng, EliIE: An Open-Source Information Extraction System for Clinical Trial Eligibility Criteria, Journal of the American Medical Informatics Association (JAMIA).


Shaodian Zhang, Edouard Grave, Elizabeth Sklar, and Noemie Elhadad, Longitudinal Analysis of Discussion Topics in an Online Breast Cancer Community using Convolutional Neural Networks, Journal of Biomedical Informatics (JBI).

 

Tian Kang, Shaodian Zhang, Xingting Zhang, Dong Wen, and Jianbo Lei, Detecting Negation and Scope in Chinese Clinical Notes using Character and Word Embedding, Computer Methods and Programs in Biomedicine.


Shaodian Zhang and Noemie Elhadad, Factors Contributing to Dropping-out in an Online Health Community: Static and Longitudinal Analyses, AMIA 2016. Best student paper finalist.

 

Shaodian Zhang, Tian Kang, Xingting Zhang, Dong Wen, Noemie Elhadad, and Jianbo Lei, Speculation Detection for Chinese Clinical Notes: Impacts of Word Segmentation and Embedding Models, Journal of Biomedical Informatics (JBI).


Shaodian Zhang, Erin Bantum, Jason Owen, and Noemie Elhadad, Does Sustained Participation in an Online Health Community Affect Sentiment?, AMIA 2014. Best student paper finalist.

 

Noemie Elhadad, Shaodian Zhang, Patricia Driscoll, and Samuel Brody, Characterizing the Sublanguage of Online Breast Cancer Forums for Medications, Symptoms, and Emotions, AMIA 2014.


Xiaohua Liu, Furu Wei, Shaodian Zhang, and Ming Zhou, Named Entity Recognition for Tweets, ACM Transactions on Intelligent Systems and Technology (TIST).

 

Shaodian Zhang and Noemie Elhadad, Unsupervised Biomedical Named Entity Recognition: Experiments with Clinical and Biological Texts, Journal of Biomedical Informatics (JBI).


Xiaohua Liu, Shaodian Zhang, Furu Wei, and Ming Zhou, Recognizing Named Entities in Tweets, ACL 2011.

 

Shaodian Zhang, Hai Zhao, Guodong Zhou, and Bao-liang Lu, Hedge Detection and Scope Finding by Sequence Labeling with Normalized Feature Selection, CoNLL 2010.

 

Yanru Qu, Zhenghui Wang, Lin Qiu, Ken Chen, Shaodian Zhang, Yong Yu. Sampled in Pairs and Driven by Text: A New Graph Embedding Framework. WWW'19. 2019.

 


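World-leading, mature Chinese medical natural language processing technology
Letting AI read medical records, distill information, and acquire knowledge the way a physician does
Top-tier commercial performance
A complete technology stack
Leading academic strength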
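A human-versus-machine experiment on medical record understanding and information extraction in a real-world setting showed that the Synyi AI (森亿智能) natural language processing engine, running on its own, improves data entry efficiency by several hundred times and accuracy by 9%. Used together with manual parsing, it helps physicians parse records and extract information 34% faster and with 12% higher accuracy.
*Reference: Shijian Liu, Jiang Han, Lei Fang, Shaodian Zhang, Fei Wang, Handong Ma, and Ken Chen, Improving the efficacy of data entry process for clinical research with an NLP-driven medical information extraction system: a quantitative field research, to appear, Journal of Medical Internet Research.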

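Parsing algorithms for medical records from 25 clinical departments and for more than one hundred types of examination reports.

Recognizes 110 major categories of clinical information and more than 50 types of semantic relations. The knowledge graph contains over 510,000 concepts and 27 million relations, and supports mapping to more than 10 mainstream international medical terminology standards, including SNOMED-CT, ICD, MedDRA, ATC, and LOINC, enabling interoperability between domestic and global data.

A full NLP suite covering word segmentation, named entity recognition, chunking, conjunction analysis, syntactic parsing, semantic parsing, and entity normalization. Each component performs above 95%, and end-to-end performance is 90+%.

The transfer learning framework reduces the amount of annotation required by 97% and improves annotation efficiency by 30%+.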
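To make the shape of such a suite concrete, here is a minimal sketch of three of the listed stages chained together: word segmentation, named entity recognition, and entity normalization to a terminology code. It is an illustration only, not the Synyi AI engine. It swaps the learned models for a plain dictionary lookup (forward maximum matching), and the names `LEXICON`, `Entity`, `segment`, `recognize_and_normalize`, and `run_pipeline` are all hypothetical. The two ICD-10 codes in the toy lexicon (I10, E11) are real codes; everything else is assumed for the example.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon: surface form -> (entity type, ICD-10 code).
# A production system would use learned models and a full terminology service.
LEXICON = {
    "高血压": ("Disease", "I10"),     # essential (primary) hypertension
    "2型糖尿病": ("Disease", "E11"),  # type 2 diabetes mellitus
}


@dataclass
class Entity:
    text: str   # surface form found in the record
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    label: str  # entity type
    code: str   # normalized terminology code


def segment(text, vocab, max_len=6):
    """Forward maximum matching: at each position take the longest
    vocabulary word, falling back to a single character."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append((piece, i))
                i += size
                break
    return tokens


def recognize_and_normalize(tokens, lexicon):
    """Dictionary-based NER plus normalization in one pass."""
    entities = []
    for piece, start in tokens:
        if piece in lexicon:
            label, code = lexicon[piece]
            entities.append(Entity(piece, start, start + len(piece), label, code))
    return entities


def run_pipeline(text):
    """Chain the stages: segmentation -> NER -> entity normalization."""
    tokens = segment(text, set(LEXICON))
    return recognize_and_normalize(tokens, LEXICON)


if __name__ == "__main__":
    for entity in run_pipeline("患者有高血压和2型糖尿病史"):
        print(entity)
```

On the sample sentence, the sketch finds 高血压 at offsets 3 to 6 and 2型糖尿病 at offsets 7 to 12 and attaches the codes I10 and E11. Keeping each stage as a separate function mirrors the component-wise design described above, where each stage can be evaluated on its own as well as end to end.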

The Synyi AI (森亿智能) team has published dozens of papers in top conferences and journals in artificial intelligence and medical informatics, including JAMA Internal Medicine, Bioinformatics, Briefings in Bioinformatics, Journal of Biomedical Informatics, Journal of the American Medical Informatics Association, Journal of Medical Internet Research, AAAI, KDD, IJCAI, ACL, NAACL, COLING, NIPS, and ICML.
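Outstanding capabilities in hospital information and medical data integration and governance
A strong backbone for hospitals' data-driven and intelligent transformation
3871+
More than 3,871 metadata items
140+
Team experience integrating data at 140+ large Class III Grade A (三甲) hospitals
765+
More than 765 rules for data integration, completion, and error correction
5
Data integration 5 times as efficient as traditional methods
10 billion+
More than 10 billion records governed and integrated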
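Leading end-to-end clinical machine learning modeling technology
Letting AI gain insight into medical data and characterize clinical needs
3
Three-in-one automated system
An integrated system for producing, deploying, and monitoring configurable clinical risk assessment models
10+
Support for more than 10 machine learning models
For mainstream machine learning models such as Lasso, Naive Bayes, LDA, SVM, RandomForest, DeepForest, XGBoost, LightBoost, CatBoost, and Deep Neural Network, a bio-inspired swarm intelligence search algorithm tunes parameters automatically to select high-performing models or model ensembles, providing the best machine learning solution for the scenario at hand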
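The page does not say which swarm intelligence algorithm is used. As one possible illustration, the sketch below uses particle swarm optimization (PSO), a common member of that family, to search two hyperparameters. Everything here is an assumption made for the example: the function names, the PSO coefficients, and above all `toy_validation_loss`, which stands in for a real cross-validated loss and has its minimum placed at a learning rate of 0.1 and a depth of 6 by construction.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimal particle swarm optimization over box-bounded parameters."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # each particle's best position
    best_val = [objective(p) for p in pos]    # and the loss it achieved there
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best so far

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull toward the particle's own
                # best, and pull toward the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Move and clamp to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def toy_validation_loss(params):
    """Hypothetical stand-in for a cross-validated loss."""
    learning_rate, max_depth = params
    return (learning_rate - 0.1) ** 2 + 0.01 * (max_depth - 6.0) ** 2


if __name__ == "__main__":
    best, loss = pso_minimize(toy_validation_loss, [(0.001, 1.0), (1.0, 12.0)])
    print(best, loss)
```

In practice the objective would train and validate a model for each candidate setting, which is far more expensive than this toy function, and integer-valued parameters such as tree depth would need rounding before use.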
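10000+
Screening of more than 10,000 combinations of patient risk factors
Connects directly to thousands of predictors in the electronic medical record system, and analyzes and identifies interactions and nonlinear relationships among them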
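One simple way to check whether two binary risk factors interact is the additive interaction contrast used in epidemiology: IC = R11 - R10 - R01 + R00, where Rab is the observed outcome rate among patients with factor values a and b. A value near zero means the two effects simply add up; a clearly nonzero value suggests an interaction. The sketch below is a hedged illustration of that idea and is not taken from the system described above. The field names `smoker`, `diabetes`, and `event` are hypothetical, and the records are synthetic.

```python
from collections import defaultdict


def interaction_contrast(records, factor_a, factor_b, outcome):
    """Additive-scale interaction contrast for two binary factors:
    IC = R11 - R10 - R01 + R00, with Rab the observed outcome rate."""
    counts = defaultdict(lambda: [0, 0])  # (a, b) -> [events, patients]
    for row in records:
        cell = counts[(row[factor_a], row[factor_b])]
        cell[0] += row[outcome]
        cell[1] += 1

    risk = {}
    for key in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        events, total = counts[key]
        if total == 0:
            raise ValueError(f"no patients with factor values {key}")
        risk[key] = events / total
    return risk[(1, 1)] - risk[(1, 0)] - risk[(0, 1)] + risk[(0, 0)]


def synthetic_records():
    """Synthetic data: 10 patients per cell with 1, 2, 2, and 7 events."""
    events_per_cell = {(0, 0): 1, (1, 0): 2, (0, 1): 2, (1, 1): 7}
    rows = []
    for (a, b), n_events in events_per_cell.items():
        for i in range(10):
            rows.append({"smoker": a, "diabetes": b, "event": int(i < n_events)})
    return rows


if __name__ == "__main__":
    ic = interaction_contrast(synthetic_records(), "smoker", "diabetes", "event")
    print(round(ic, 3))
```

With these synthetic counts the contrast comes out at about 0.4: the risk with both factors present (0.7) is well above what the two individual effects would predict if they merely added up (0.3). Screening thousands of predictors this way would also need multiple-testing control and a method for continuous variables, neither of which this sketch attempts.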