About
Piotr Szymański is an Assistant Professor at the Department of Artificial Intelligence at the Wrocław University of Science and Technology and a Machine Learning Engineer at Avaya. He works professionally in data analysis, statistical reasoning, geospatial data science, natural language processing, machine learning and artificial intelligence.
He is an alumnus of the Top 500 Innovators program at Stanford University and has worked at several institutions over the years, including the Hasso Plattner Institute in Potsdam, the Jožef Stefan Institute in Ljubljana, the University of Notre Dame and the University of Technology Sydney. He is the author of http://scikit.ml, a popular Python library for multi-label classification, and https://github.com/niedakh/pqdm/, a parallel processing wrapper for tqdm.
Apart from multi-label classification, Piotr has published papers on urban data, traffic analysis and bridging the gap between ASR and NLP in spoken language understanding systems. In his free time he is an urban activist in Wrocław, chairing the [Society of Beautification of the City of Wrocław](http://tumw.pl) and serving as a member of the Wrocław [district council of Brochów](http://brochow.wroclaw.pl).
Recently he joined an international group that models the spread of coronavirus, http://mocos.pl, which publishes regular reports on the state of the pandemic in Poland as well as analyses of related phenomena.
Research Interests
Education
MSc in Computer Science
Wrocław University of Science and Technology · 2008
Publications
OBSR: Open Benchmark for Spatial Representations
2025 · SIGSPATIAL 2025
A Vision for Geo-Temporal Deep Research Systems
2025 · arXiv 2506.14345
Evaluation of Code LLMs on Geospatial Code Generation
2024 · GeoAI@SIGSPATIAL 2024
Why Aren't We NER Yet? Artifacts of ASR Errors in Named Entity Recognition
2023 · ACL 2023
SRAI: Towards Standardization of Geospatial AI
2023 · GeoAI@SIGSPATIAL 2023
Improved DeepFake Detection Using Whisper Features
2023 · INTERSPEECH 2023
Map Diffusion: Text Promptable Map Generation Diffusion Model
2023 · UrbanAI@SIGSPATIAL 2023
Analysis of elevated PM2.5 episodes using the campus air quality sensor network of Wrocław University of Science and Technology
2021 · *3rd Symposium "Air Quality and Health" : book of abstracts, online conference 12-14.05.2021*
hex2vec: context-aware embedding H3 hexagons with OpenStreetMap tags
2021 · *GeoAI 2021 : Proceedings of the 4th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery, (GeoAI 2021) : Nov. 2nd, 2021, Beijing, China*
gtfs2vec: learning GTFS embeddings for comparing public transport offer in microregions
2021 · *GeoSearch 2021 : Proceedings of the 1st ACM SIGSPATIAL International Workshop on Searching and Mining Large Collections of Geospatial Data, (GeoSearch 2021) : Nov. 2nd, 2021, Beijing, China*
Spatial data mining of public transport incidents reported in social media
2021 · *IWCTS 2021 : Proceedings of the 14th ACM SIGSPATIAL International Workshop on Computational Transportation Science (IWCTS’21), November 2, 2020, Seattle, WA, USA*
Transfer learning approach to bicycle-sharing systems' station location planning using OpenStreetMap data
2021 · *ARIC 2021 : Proceedings of the 4th ACM SIGSPATIAL International Workshop on Advances in Resilient and Intelligent Cities (ARIC 2021), 2nd Nov 2021, Seattle, WA, USA*
Punctuation prediction in spontaneous conversations: can we mitigate ASR errors with retrofitted word embeddings?
2020 · *Interspeech 2020 : 21st Annual Conference of the International Speech Communication Association, 25-29 October 2020, Shanghai, China*
WER we are and WER we think we are
2020 · *Findings of the Association for Computational Linguistics: EMNLP 2020, 16-20 November 2020*
Is the best better? Bayesian statistical model comparison for Natural Language Processing
2020 · *2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020 : Proceedings of the Conference, November 16-20, 2020*
Analiza zmienności stężeń pyłu PM2,5 na wybranych obszarach kampusu Politechniki Wrocławskiej podczas stanu zagrożenia epidemicznego oraz stanu epidemii ogłoszonych z powodu rozprzestrzeniania się wirusa SARS-CoV-2
2020 · *Aktualne trendy w ochronie powietrza i klimatu - kontrola, monitoring, prognozowanie i ograniczanie emisji : praca zbiorowa*
Return on investment in machine learning: crossing the chasm between academia and business
2020 · *Foundations of Computing and Decision Sciences*
Assessing the usefulness of dense sensor network for PM2.5 monitoring on an academic campus area
2020 · *Science of The Total Environment*
scikit-multilearn: a scikit-based Python environment for performing multi-label classification
2019 · *Journal of Machine Learning Research*
Avaya Conversational Intelligence: a real-time system for Spoken Language Understanding in human-human call center conversations
2019 · *Interspeech 2019 : 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 15th-19th 2019.*
Zastosowanie narzędzi GIS w analizie jakości powietrza atmosferycznego na terenie wybranego kampusu uczelni wyższej w Polsce
2019 · *XI Konferencja Naukowa Ochrona Powietrza w Teorii i Praktyce : Zakopane, 22-25 październik 2019 r. : książka poszerzonych abstraktów*
A network science perspective on label dependencies in multi-label classification
2019
Sensor network for PM2.5 measurements on an academic campus area
2019 · *International Conference on Advances in Energy Systems and Environmental Engineering (ASEE19) : Wrocław, Poland, June 9-12, 2019*
Spatio-temporal profiling of public transport delays based on large-scale vehicle positioning data from GPS in Wrocław
2018 · *IEEE Transactions on Intelligent Transportation Systems*
Punctuation prediction model for conversational speech
2018 · *19th Annual Conference of the International Speech Communication Association, INTERSPEECH 2018 : 2-6 September 2018, Hyderabad, India*
A network perspective on stratification of multi-label data
2017 · *First International Workshop on Learning with Imbalanced Domains: Theory and Applications, 22 September 2017, ECML-PKDD, Skopje, Macedonia*
Is a data-driven approach still better than random choice with Naive Bayes classifiers?
2017 · *Intelligent Information and Database Systems : 9th Asian Conference, ACIIDS 2017, Kanazawa, Japan, April 3-5, 2017 : proceedings. Pt. 1*
Fast and accurate - improving lexicon-based sentiment classification with an ensemble methods
2016 · *Intelligent Information and Database Systems : 8th Asian Conference, ACIIDS 2016, Da Nang, Vietnam, March 14-16, 2016 : proceedings. Pt. 2*
Comprehensive study on lexicon-based ensemble classification sentiment analysis
2016 · *Entropy*
How is a data-driven approach better than random choice in label space division for multi-label classification?
2016 · *Entropy*
Three is more interesting than two: words against publishing methods for sentiment analysis tested only for two classes
2016 · *14th Students' Science Conference : management and algorithms, 22-25 September, 2016.*
Simpler is better? Lexicon-based ensemble sentiment classification beats supervised methods
2014 · *Proceedings of the 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2014 : Beijing, China, 17-20 August 2014*
MLG: enhancing multi-label classification with modularity-based label grouping
2013 · *Hybrid artificial intelligent systems : 8th international conference, HAIS 2013, Salamanca, Spain, September, 11-13, 2013 : proceedings*
The analytical psychology of architecture: between subjective expression and objective meaning
2011 · *Anais do ... Congresso Projetar*
