Contained in:
Book Chapter

A Comparative Study of Deep Learning Models for Symbol Detection in Technical Drawings

  • Benedikt Faltin
  • Damaris Gann
  • Markus König

Symbols are a universal way to convey complex information in technical drawings, since they can represent a wide range of elements, including components, materials, and relationships, in a concise and space-saving manner. Accurate symbol detection is therefore a crucial step towards the digital and automatic interpretation of pixel-based drawings. To enhance the efficiency of the digitization process, current research focuses on automating symbol detection using deep learning models. However, the ever-growing repertoire of model architectures makes it difficult for researchers and practitioners alike to keep track of the latest advancements and to select the most suitable architecture for their respective use cases. To provide guidance, this contribution conducts a comparative study of prevalent and state-of-the-art model architectures for symbol detection in pixel-based construction drawings. Six object detection architectures are evaluated: YOLOv5, YOLOv7, YOLOv8, Swin Transformer, ConvNeXt, and Faster R-CNN. These models are trained and tested on two distinct datasets from the bridge and residential building domains, both representing substantial sub-sectors of the construction industry. Furthermore, the models are assessed against five criteria: detection accuracy, robustness to data scarcity, training time, inference time, and model size. In summary, our comparative study highlights the performance and capabilities of different deep learning models for symbol detection in construction drawings. Through comprehensive evaluation and practical insights, this research facilitates the advancement of automated symbol detection by revealing the strengths and weaknesses of the individual architectures, thus providing users with valuable guidance in choosing the most appropriate model for their real-world applications.
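
To illustrate how the five evaluation criteria could be measured in practice for one of the compared architectures, the sketch below uses the open-source ultralytics YOLOv8 package (reference 15). It is a minimal, assumption-laden illustration rather than the evaluation pipeline used in the chapter: the dataset configuration "symbols.yaml", the checkpoint "yolov8s.pt", the image "drawing.png", and the training settings are hypothetical placeholders.

```python
# Minimal sketch (not the authors' pipeline): train and evaluate a YOLOv8 detector
# on a symbol dataset, then read off the criteria considered in the study.
# "symbols.yaml", "yolov8s.pt", and "drawing.png" are placeholder names.
import time
from ultralytics import YOLO

model = YOLO("yolov8s.pt")  # pretrained checkpoint as a starting point

# Training time
t0 = time.time()
model.train(data="symbols.yaml", epochs=100, imgsz=640)
training_time_s = time.time() - t0

# Detection accuracy: COCO-style mean average precision on the validation split
metrics = model.val()
print(f"mAP@0.5      : {metrics.box.map50:.3f}")
print(f"mAP@0.5:0.95 : {metrics.box.map:.3f}")

# Inference time per image (milliseconds, as reported by the framework)
result = model.predict("drawing.png")[0]
print(f"inference    : {result.speed['inference']:.1f} ms")

# Model size, expressed as the number of trainable parameters
n_params = sum(p.numel() for p in model.model.parameters())
print(f"parameters   : {n_params / 1e6:.1f} M")
print(f"training     : {training_time_s / 3600:.2f} h")
```

Robustness to data scarcity, the remaining criterion, would be probed by repeating the same training and validation loop on progressively smaller fractions of the training set and comparing the resulting mAP values.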

  • Keywords:
  • Computer Vision
  • Technical Drawings
  • Symbol Detection
  • Comparative Study

Benedikt Faltin

Ruhr-University Bochum, Germany - ORCID: 0000-0003-1354-7817

Damaris Gann

Ruhr-University Bochum, Germany

Markus König

Ruhr-University Bochum, Germany - ORCID: 0000-0002-2729-7743

  1. Adam, S., Ogier, J. M., Cariou, C., Mullot, R., Labiche, J., & Gardes, J. (2000). Symbol and character recognition: application to engineering drawings. International Journal on Document Analysis and Recognition, 3(2), 89–101. DOI: 10.1007/s100320000033
  2. Ah-Soon, C. (1998). A constraint network for symbol detection in architectural drawings. In K. Tombre & A.K. Chhabra (Eds.), Lecture Notes in Computer Science. Springer. DOI: 10.1007/3-540-64381-8_41
  3. Brößner, P., Hohlmann, B., & Radermacher, K. (2022). Transformer vs. CNN: A Comparison on Knee Segmentation in Ultrasound Images. In F. Rodriguez Y Baena, J. W. Giles & E. Stindel (Eds.), Proceedings of the 20th Annual Meeting of the International Society for Computer Assisted Orthopaedic Surgery, Vol. 5, 31–36. DOI: 10.29007/cqcv
  4. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 248–255. DOI: 10.1109/CVPR.2009.5206848
  5. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., & Houlsby, N. (2020). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv. DOI: 10.48550/arXiv.2010.11929
  6. Elyan, E., Jamieson, L., & Ali-Gombe, A. (2020). Deep learning for symbols detection and classification in engineering drawings. Neural Networks, 129, 91–102. DOI: 10.1016/j.neunet.2020.05.025
  7. Elyan, E., Moreno-García, C. F., & Johnston, P. (2020). Symbols in Engineering Drawings (SiED): An Imbalanced Dataset Benchmarked by Convolutional Neural Networks. In L. Iliadis, P. P. Angelov, C. Jayne, & E. Pimenidis (Eds.), Proceedings of the 21st EANN (Engineering Applications of Neural Networks) 2020 Conference, 215–224. Springer. DOI: 10.1007/978-3-030-48791-1_16
  8. Faltin, B., Schönfelder, P., & König, M. (2023). Inferring Interconnections of Construction Drawings for Bridges Using Deep Learning-based Methods. In E. Hjelseth, S. F. Sujan & R. J. Scherer (Eds.), ECPPM 2022-eWork and eBusiness in Architecture, Engineering and Construction 2022, 343-350. CRC Press. DOI: 10.1201/9781003354222
  9. Faltin, B., Schönfelder, P., & König, M. (2023). Improving Symbol Detection on Engineering Drawings Using a Keypoint-Based Deep Learning Approach. The 30th EG-ICE: International Conference on Intelligent Computing in Engineering. https://www.ucl.ac.uk/bartlett/construction/sites/bartlett_construction/files/1889.pdf
  10. Gudigar, A., Chokkadi, S., & U, R. (2016). A review on automatic detection and recognition of traffic sign. Multimedia Tools and Applications, 75(1), 333–364. DOI: 10.1007/s11042-014-2293-7
  11. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770-778. DOI: 10.1109/CVPR.2016.90
  12. Huang, W., Sun, Q., Yu, A., Guo, W., Xu, Q., Wen, B., & Xu, L. (2023). Leveraging Deep Convolutional Neural Network for Point Symbol Recognition in Scanned Topographic Maps. ISPRS International Journal of Geo-Information, 12(3), 128. DOI: 10.3390/ijgi12030128
  13. Jaiswal, A., Babu, A. R., Zadeh, M. Z., Banerjee, D., & Makedon, F. (2021). A Survey on Contrastive Self-Supervised Learning. Technologies, 9(1), Article 2. DOI: 10.3390/technologies9010002
  14. Jocher, G., Chaurasia, A., Stoken, A., Borovec, J., Kwon, Y., Michael, K., TaoXie, Fang, J., imyhxy, Lorna, Zan Yifu, Wong, C., V, A., Montes, D., Wang, Z., Fati, C., Nadar, J., Laughing, … Jain, M. (2022). ultralytics/yolov5: v7.0 - YOLOv5 SOTA Realtime Instance Segmentation. Zenodo. DOI: 10.5281/zenodo.3908559
  15. Jocher, G., Chaurasia, A., & Qiu, J. (2023). YOLO by Ultralytics (Version 8.0.0). https://github.com/ultralytics/ultralytics
  16. Kalervo, A., Ylioinas, J., Häikiö, M., Karhu, A., & Kannala, J. (2019). CubiCasa5K: A Dataset and an Improved Multi-task Model for Floorplan Image Analysis. In M. Felsberg, P.-E. Forssén, I.-M. Sintorn & J. Unger (Eds.), Image Analysis: 21st Scandinavian Conference, Vol. 11482, 28-40. Springer. DOI: 10.1007/978-3-030-20205-7_3
  17. Lin, T. Y., Maire, M., Belongie, S., Bourdev, L., Girshick, R., Hays, J., Perona, P., Ramanan, D., Zitnick, C. L., & Dollár, P. (2014). Microsoft COCO: Common Objects in Context. In D. Fleet, T. Pajdla, B. Schiele & T. Tuytelaars (Eds.), Computer Vision – ECCV 2014, Vol. 13, 740-755. Springer. DOI: 10.1007/978-3-319-10602-1_48
  18. Lim, J.-S., Astrid, M., Yoon, H.-J., & Lee, S.-I. (2021). Small Object Detection using Context and Attention. 2021 International Conference on Artificial Intelligence in Information and Communication, 181–186. DOI: 10.1109/ICAIIC51459.2021.9415217
  19. Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., & Guo, B. (2021). Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. Proceedings of the IEEE/CVF International Conference on Computer Vision, 9992-10002. DOI: 10.1109/ICCV48922.2021.00986
  20. Liu, Z., Mao, H., Wu, C.Y., Feichtenhofer, C., Darrell, T., & Xie, S. (2022). A ConvNet for the 2020s. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 11976-11986. DOI: 10.1109/CVPR52688.2022.01167
  21. Loshchilov, I., & Hutter, F. (2017). Decoupled weight decay regularization. arXiv. DOI: 10.48550/arXiv.1711.05101
  22. Mani, S., Haddad, M. A., Constantini, D., Douhard, W., Li, Q., & Poirier, L. (2020). Automatic Digitization of Engineering Diagrams Using Deep Learning and Graph Search. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 176-177. DOI: 10.1109/CVPRW50498.2020.00096
  23. Moutik, O., Sekkat, H., Tigani, S., Chehri, A., Saadane, R., Tchakoucht, T. A., & Paul, A. (2023). Convolutional Neural Networks or Vision Transformers: Who Will Win the Race for Action Recognitions in Visual Data?. Sensors, 23(2), 734. DOI: 10.3390/s23020734
  24. Padilla, R., Passos, W. L., Dias, T. L. B., Netto, S. L., & da Silva, E. A. B. (2021). A Comparative Analysis of Object Detection Metrics with a Companion Open-Source Toolkit. Electronics, 10(3), 279. DOI: 10.3390/electronics10030279
  25. Ren, S., He, K., Girshick, R., & Sun, J. (2017). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(6), 1137-1149. DOI: 10.1109/TPAMI.2016.2577031
  26. Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-Resolution Image Synthesis with Latent Diffusion Models. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 10684-10695. DOI: 10.1109/CVPR52688.2022.01042
  27. Schmidt, S., Rao, Q., Tatsch, J., & Knoll, A. (2020). Advanced Active Learning Strategies for Object Detection. Proceedings of the IEEE Intelligent Vehicles Symposium. 871–876. DOI: 10.1109/IV47402.2020.9304565
  28. Wang, D., Zhang, J., Du, B., Xia, G. S., & Tao, D. (2023). An Empirical Study of Remote Sensing Pretraining. IEEE Transactions on Geoscience and Remote Sensing, 61. DOI: 10.1109/TGRS.2022.3176603
  29. Wang, C.Y., Bochkovskiy, A., & Liao, H.Y. (2022). YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors. arXiv. DOI: 10.48550/arXiv.2207.02696
  30. Zaidi, S. S. A., Ansari, M. S., Aslam, A., Kanwal, N., Asghar, M., & Lee, B. (2022). A survey of modern deep learning based object detection models. Digital Signal Processing, 126, Article 103514. DOI: 10.1016/j.dsp.2022.103514
  31. Ziran, Z., & Marinai, S. (2018). Object Detection in Floor Plan Images. In: L. Pancioni, F. Schwenker, E. Trentin, (Eds.), Artificial Neural Networks in Pattern Recognition, 383-394. Springer. DOI: 10.1007/978-3-319-99978-4_30
PDF
  • Publication Year: 2023
  • Pages: 877-886

XML
  • Publication Year: 2023

Chapter Information

Chapter Title

A Comparative Study of Deep Learning Models for Symbol Detection in Technical Drawings

Authors

Benedikt Faltin, Damaris Gann, Markus König

DOI

10.36253/979-12-215-0289-3.87

Peer Reviewed

Publication Year

2023

Copyright Information

© 2023 Author(s)

Content License

CC BY-NC 4.0

Metadata License

CC0 1.0

Bibliographic Information

Book Title

CONVR 2023 - Proceedings of the 23rd International Conference on Construction Applications of Virtual Reality

Book Subtitle

Managing the Digital Transformation of Construction Industry

Editors

Pietro Capone, Vito Getuli, Farzad Pour Rahimian, Nashwan Dawood, Alessandro Bruttini, Tommaso Sorbi

Peer Reviewed

Publication Year

2023

Copyright Information

© 2023 Author(s)

Content License

CC BY-NC 4.0

Metadata License

CC0 1.0

Publisher Name

Firenze University Press

DOI

10.36253/979-12-215-0289-3

eISBN (pdf)

979-12-215-0289-3

eISBN (xml)

979-12-215-0257-2

Series Title

Proceedings e report

Series ISSN

2704-601X

Series E-ISSN

2704-5846

143 Fulltext downloads

128 Views
