Korean Chemical Engineering Research,
Vol.57, No.6, 781-789, 2019
Smart Synthetic Path Search System for Prevention of Hazardous Chemical Accidents and Analysis of Reaction Risk
Chemical accidents occur frequently during laboratory experiments, pilot-plant operation, and reactor operation. Before a synthesis experiment begins, the relevant information should be found and understood in order to prevent accidents, and at the process design stage, reaction information is essential for preventing runaway reactions. Although synthesis information is available from many sources, including the Internet, searching it takes a long time, and choosing an appropriate path is difficult because each synthesis method uses different substances. To shorten the time researchers spend searching for synthetic paths and to help them identify the hazards and intermediates that may arise during synthesis, this study proposes a smart synthetic path search system. The proposed system collects information from the Internet by Web scraping and Web crawling with Selenium, a Python package, and uses it to update its database (DB) automatically. The path search algorithm is based on depth-first search: it starts from the target substance, classifies results by criteria such as hazardous chemical grade and yield, and suggests all synthetic paths that fall within a specified limit on the number of path steps. Researchers can also extend the DB for their own research purposes by registering their private data in the required format. The system is planned to be released as open source so that it can be used free of charge. It is expected to help researchers find safer reaction methods by consulting the suggested paths, and thereby to help prevent accidents.
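The path search described above can be illustrated with a minimal sketch of a depth-limited depth-first search. This is not the authors' implementation: the abstract does not give the DB schema, the hazard grading scale, or the search direction, so the `Reaction` record, the integer `hazard_grade` (higher meaning more hazardous), the toy substances A to D, and the choice to walk backward from the target to its precursors are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    """One synthesis step. All fields are hypothetical, not the paper's schema."""
    product: str
    precursor: str
    yield_fraction: float  # step yield between 0 and 1
    hazard_grade: int      # assumed scale: 0 = none, higher = more hazardous


# Hypothetical toy database; a real one would be filled by the crawler.
REACTIONS = [
    Reaction("D", "C", 0.90, 1),
    Reaction("D", "B", 0.60, 3),
    Reaction("C", "B", 0.80, 0),
    Reaction("B", "A", 0.95, 2),
    Reaction("C", "D", 0.50, 0),  # closes a cycle, exercised by the visited check
]


def build_index(reactions):
    """Group reactions by product so precursors of a substance are easy to look up."""
    index = {}
    for rxn in reactions:
        index.setdefault(rxn.product, []).append(rxn)
    return index


def find_paths(target, index, max_steps):
    """Return every path of 1..max_steps reactions leading to `target`."""
    results = []

    def dfs(substance, path, visited):
        if len(path) == max_steps:  # step limit reached, stop going deeper
            return
        for rxn in index.get(substance, []):
            if rxn.precursor in visited:  # avoid revisiting a substance on this path
                continue
            new_path = path + [rxn]
            results.append(new_path)
            dfs(rxn.precursor, new_path, visited | {rxn.precursor})

    dfs(target, [], {target})
    return results


def summarize(path):
    """Report the chain, the overall yield, and the worst hazard grade of a path."""
    overall = 1.0
    for rxn in path:
        overall *= rxn.yield_fraction
    chain = [path[0].product] + [rxn.precursor for rxn in path]
    return {
        "chain": " <- ".join(chain),
        "overall_yield": round(overall, 4),
        "max_hazard": max(rxn.hazard_grade for rxn in path),
    }


if __name__ == "__main__":
    paths = find_paths("D", build_index(REACTIONS), max_steps=2)
    # Rank safer paths first, then higher overall yield.
    ranked = sorted(
        (summarize(p) for p in paths),
        key=lambda s: (s["max_hazard"], -s["overall_yield"]),
    )
    for row in ranked:
        print(row)
```

Every partial path is recorded, not only those that end at a substance with no known precursor, because any intermediate could in principle be purchased as a starting material. The ranking by worst hazard grade and then by overall yield is one plausible way to present the suggested paths; the abstract only says that paths are distinguished by such criteria.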