Data Science
Accreditations

Tuition fee EU nationals (2025/2026)
Tuition fee non-EU nationals (2025/2026)
Check here the detailed study plan (in Portuguese only)
Note: There are curricular units that can accommodate international students and can therefore be taught in English, namely Big Data Management, Forecasting Models and Unsupervised Statistical Analysis.
Programme Structure for 2025/2026
Curricular Courses | Course Group | Credits (ECTS)
---|---|---
1st Year | |
Bayesian Modelling | Mandatory Courses | 6.0
Text Mining for Data Science | Mandatory Courses | 6.0
Interdisciplinary Seminar in Data Science | Mandatory Courses | 6.0
Business Analytics Fundamentals | Mandatory Courses | 6.0
Time Series Analysis and Forecasting | Mandatory Courses | 6.0
2nd Year | |
Project Design for Data Science | Mandatory Courses | 6.0
Deep Learning for Computer Vision | Mandatory Courses | 6.0
Master Project in Data Science | Final Work | 48.0
Master Dissertation in Data Science | Final Work | 48.0
Bayesian Modelling
LO1. Characterize the basic concepts of Bayesian modelling
LO2. Apply Bayesian regression, classification and optimization models to support decision making
LO3. Apply the Bayesian approach to statistical learning
PC1. Bayes Theorem and Bayesian paradigm
PC2. Graphical and hierarchical models
PC3. Bayesian inference
PC4. Bayesian optimization
PC5. Bayesian regression and classification
PC6. Bayesian latent factor models
Students may choose either assessment throughout the semester or a final exam.
Assessment throughout the semester:
- group work with a minimum grade of 8 (50%)
- individual test with a minimum grade of 8 (50%)
Approval requires a minimum grade of 10.
EXAM:
The Final Exam is a written exam. Students have to achieve a minimum grade of 10 to pass.
Reich, B. J., S. K. Ghosh (2019), Bayesian Statistical Methods, Boca Raton: Chapman and Hall/CRC
McElreath, R. (2020), Statistical Rethinking: A Bayesian Course with Examples in R and Stan, CRC Press.
Levy, R., Mislevy, R. J. (2016), Bayesian Psychometric Modeling, 1st Edition. Boca Raton: Chapman and Hall/CRC
Kruschke, J. K. (2015), Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan. Academic Press / Elsevier.
Durr, O., B. Sick (2020), Probabilistic Deep Learning, Manning Publications Co.
Theodoridis, S. (2020), Machine Learning: A Bayesian and Optimization Perspective, Elsevier Ltd.
Martin, O., R. Kumar, J. Lao (2022), Bayesian Modeling and Computation in Python, CRC Press.
Heard, N. (2021), An Introduction to Bayesian Inference, Methods and Computation, Berlin: Springer Cham.
Albert, J., J. Hu (2020), Probability and Bayesian Modeling, Boca Raton: CRC Press/Taylor & Francis Group.
R / Python code
Lecture slides
Text Mining for Data Science
LO1. Understand the fundamentals and challenges of Text Mining
LO2. Learn techniques for document preparation, cleaning, and representation
LO3. Apply Natural Language Processing methods
LO4. Classify texts using machine learning
LO5. Apply Text Mining techniques in practice
The learning objectives are aligned with a teaching method that combines theory and practice. Students will acquire a solid theoretical foundation in Text Mining, its challenges, and main techniques. Through practical activities and projects, they will develop skills in preprocessing, modeling, classification, and information extraction from texts. By the end of the course, students will be capable of applying Text Mining methods in real-world contexts, using current tools and resources, preparing them to tackle complex problems in the field of text analysis.
Introduction
CP1: Importance of large quantities of text, challenges and current methods
CP2: Unstructured vs. (semi-)structured information
CP3: Obtaining and filtering information, information extraction and Data Mining
Document Representation
CP4: Document pre-processing
CP5: Feature extraction: terms as features
CP6: Term weighting schemes
CP7: Vector space models
CP8: Similarity measures
Natural Language Processing
CP9: Language models
CP10: Morphology and part-of-speech tagging
CP11: Complex structures: syntactic analysis
CP12: Information extraction
Text Classification
CP13: Introduction to statistical machine learning
CP14: Evaluation
CP15: Generative classifiers
CP16: Discriminative classifiers
CP17: Unsupervised learning
CP18: Text Mining Resources
Case Study
CP19: Sentiment analysis
CP20: Topic classification and identification
This course uses only assessment throughout the semester and does not include exams.
Assessment components:
a) TESTS (2 mini-tests: 5% each, final test: 40%), taken during the course period;
b) PROJECT (50%).
The TESTS grade can be replaced by a written test to be taken in the assessment period corresponding to the 1st season, 2nd season or special season (Art. 14 of the RGACC).
The PROJECT grade is limited to the TEST grade + 6 points.
Students may improve their grade in the TESTS component by taking a written test during the assessment period corresponding to the 1st season. Students wishing to do so must inform the teachers as soon as the assessment throughout the semester marks are published.
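The weighting and the cap on the PROJECT grade can be illustrated with a short sketch. It is not an official calculator: it assumes all grades are on a 0-20 scale, that the TESTS component grade is the weighted average of the three tests rescaled to 0-20, and that no rounding is applied. The function name is hypothetical.

```python
def text_mining_final_grade(mini1, mini2, final_test, project):
    """Illustrative final grade for the Text Mining course (assumed 0-20 scale)."""
    # TESTS component: 5% + 5% + 40% = 50% of the final grade.
    # Assumption: the component grade is rescaled back to 0-20.
    tests = (0.05 * mini1 + 0.05 * mini2 + 0.40 * final_test) / 0.50
    # The PROJECT grade is limited to the TESTS grade + 6 points.
    capped_project = min(project, tests + 6)
    # TESTS and PROJECT each count for 50%.
    return 0.50 * tests + 0.50 * capped_project
```

Under these assumptions, a student with 10 in every test and 18 in the project has the project capped at 16, which gives a final grade of 13.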
* Machine Learning for Text (2018). Charu C. Aggarwal. https://doi.org/10.1007/978-3-319-73531-3
* An Introduction to Text Mining: Research Design, Data Collection, and Analysis 1st Edition (October 11, 2017). Gabe Ignatow, Rada F. Mihalcea. SAGE Publications. https://methods.sagepub.com/book/an-introduction-to-text-mining
* Speech and Language Processing (3rd ed. draft, 2024), Dan Jurafsky and James H. Martin. Available at: https://web.stanford.edu/~jurafsky/slp3/
* Natural Language Processing for Social Media, Second Edition. Synthesis Lectures on Human Language Technologies. Morgan & Claypool, 2017. Atefeh Farzindar and Diana Inkpen. https://link.springer.com/book/10.1007/978-3-031-02167-1
* Jacob Eisenstein. Introduction to Natural Language Processing. Adaptive Computation and Machine Learning. The MIT Press, 2019. https://mitpress.mit.edu/9780262042840/introduction-to-natural-language-processing/
Interdisciplinary Seminar in Data Science
At the end of this course, the student should be able to:
LO1. Explain the development of Data Science over time.
LO2. Differentiate between applied development (application solution) and fundamental development (research).
LO3. Explain the interrelationship between each interdisciplinary topic covered and Data Science.
LO4. Criticise the indiscriminate use of data, whether personal or not, without respect for the principle of minimisation.
LO5. Justify the advantage of using data science processes and methods in societal and environmental problems.
The specific programme contents (CP) may change or be adjusted depending on the availability of guest seminars. However, some high-level contents must be addressed in the light of current knowledge, such as:
CP1. History of Data Science.
CP2. Visual and Narrative Information and Perception.
CP3. Cyberlaw in Data Science.
CP4. Ethically Responsible Artificial Intelligence.
CP5. Data Science in Economics and Management.
CP6. Data Science in Health.
CP7. Data Science in the Humanities.
CP8. Data Science in Society.
CP9. Data Science for the Future.
As this is a seminar course, there is no written exam.
Assessment will take place throughout the semester, with students participating in small collaborative working groups (E1), preparing individual seminar sheets (E2), and carrying out research into a problem related to one of the themes covered in the different seminars with (i) a final oral presentation and (ii) a digital infographic element (E3).
Elements E1 and E2 will be assessed by the coordinator, while E3 will be assessed by peers, moderated by the coordinator.
The final mark will be calculated as: 0.3 E1 + 0.3 E2 + 0.4 E3.
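As a worked illustration of the formula above, the following sketch computes the final mark from the three elements. It assumes each element is graded on a 0-20 scale and that no rounding is applied; the function name is hypothetical.

```python
def seminar_final_mark(e1, e2, e3):
    """Weighted final mark: 0.3 E1 + 0.3 E2 + 0.4 E3 (assumed 0-20 scale)."""
    # E1: collaborative group work, E2: individual seminar sheets,
    # E3: research presentation and infographic (peer assessed).
    return 0.3 * e1 + 0.3 * e2 + 0.4 * e3
```

For example, marks of 14, 16 and 15 in E1, E2 and E3 give 4.2 + 4.8 + 6.0 = 15, up to floating-point rounding.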
Dependent on the specific topics addressed by the person responsible for each weekly seminar.
Voeneky, S., Kellmeyer, P., Mueller, O., & Burgard, W. (Eds.). 2022. The Cambridge Handbook of Responsible Artificial Intelligence: Interdisciplinary Perspectives. Cambridge: Cambridge University Press.
Dignum, V. 2019. Responsible Artificial Intelligence: How to Develop and Use AI in a Responsible Way. Springer Publishing Company, Incorporated, 1st edition. ISBN 3030303705.
Conitzer, V., Sinnott-Armstrong, W., Schaich Borg, J., Deng, Y., & Kramer, M. Moral decision making frameworks for artificial intelligence. Proceedings of the AAAI Conference on Artificial Intelligence, 31(1), Feb. 2017. doi: 10.1609/aaai.v31i1.11140. URL https://ojs.aaai.org/index.php/AAAI/article/view/11140.
Domingos, P. The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World. Basic Books, 2015. ISBN 978-0465065707.
Karachalios, K., Stern, N. & Havens, J. C. White paper - measuring what matters in the era of global warming and the age of algorithmic promises. Measuring What Matters in the Era of Global Warming and the Age of Algorithmic Promises, pages 1–17, 2020.
van den Hoven, J. 2019. Design for values and values for design. Information Age, Journal of the Australian Computer Society, pages 4–7.
Business Analytics Fundamentals
LO1. By the end of the course unit, students should understand how to use big data and data analysis to out-compete traditional companies in their industries.
LO2. Students should also be able to define and implement analytical reports and dashboards, covering basic ETL processes, advanced analytical modelling and effective data visualization.
LO3. Finally, students should develop soft skills, including teamwork and collaboration, communication, and agile and critical thinking.
P1. Data-driven decision making.
P2. Types of Analytics.
P3. Data processing, modeling and visualization.
P4. Effective Business Presentation / communication; ability to explain complex analytical models and results.
P5. Power BI Analytics Platform.
Assessment throughout the semester:
a) Individual work I - ETL and Analytical Modeling (25%) - LO 1, 2
b) Individual work II - DAX and Data Visualization (45%) - LO 1, 2, 3
c) Online discussion of individual work I and II (30%) - LO 1, 2, 3
Approval requires that students: a) attend at least 2/3 of classes, and b) obtain a final grade of at least 10 points.
Scale: 0-20 points.
Given its applied, design-oriented nature, the course has no final exam.
Aspin, A., Pro Power BI Desktop: Self-Service Analytics and Data Visualization for the Power User, 2020, 3rd Edition, Apress.
Deckler, G. & Powell, B., Microsoft Power BI Cookbook: Convert raw data into business insights with updated techniques, use cases, and best practices, 2024, 3rd Ed., Packt Publishing.
Microsoft, Microsoft Learn Power BI, n.a., Microsoft, https://learn.microsoft.com/en-us/training/powerplatform/power-bi
Albright, S. & Winston, W., Business Analytics: Data Analysis & Decision Making, 2019, 7th Edition, South-Western College Pub.
Berthold, M.R., Borgelt, C., Höppner, F., Klawonn, F. & Silipo, R., Guide to Intelligent Data Science: How to Intelligently Make Use of Real Data, 2020, 2nd Edition, Springer International Publishing.
Knaflic, C. N., Storytelling com dados: um Guia Sobre Visualização de Dados Para Profissionais de Negócios, 2019, Alta Books.
Janicijevic A., Power Query Cookbook, 2021, Packt Publishing.
Groot, R. & Korte, M., The Definitive Guide to Power Query (M), 2024, Packt Publishing.
Adamson C., Star Schema, 2010, McGraw-Hill Education.
Russo M. & Ferrari A., The Definitive Guide to DAX, 2nd Ed., 2019, Microsoft Press.
Russo M. & Ferrari A., DAX Patterns, 2nd Ed., SQLBI Corp.
Knaflic C., Storytelling with Data, 2015, Wiley.
McCandless, D., Knowledge is Beautiful, 2014, William Collins.
Bahga, A. & Madisetti, V., Big Data Science & Analytics: A Hands-On Approach, 2016, VPT.
Time Series Analysis and Forecasting
At the end of this learning unit's term, the student must be able to:
LG1. Recognize and apply the classical time series models;
LG2. Recognize and apply SARIMAX and Prophet models;
LG3. Recognize and apply MLP (multi-layer perceptron) artificial neural network models;
LG4. Recognize and apply Deep Learning algorithms (recurrent neural networks) for time series forecasting/trading;
LG5. Carry out basic programming and computation with R and Python;
LG6. Apply the studied concepts to extract information and value from real-world data.
P1. Artificial Neural Networks (ANN) (2 lectures)
P1.1. Perceptron. Activation function
P1.2. MLP (multi-layer perceptron) and Backpropagation
P1.3. Loss function, learning/training an ANN
P1.4. Regularization, hyperparameter tuning
P2. Time series and sequential data (2 lectures)
P2.1. Basic concepts
P2.2. Trends and seasonality
P2.3. Stationarity, unit root tests, Granger causality
P2.4. ARMA/ARIMA/SARIMAX models
P2.5. Residual assumptions, diagnostic tests
P2.6. Prophet model
P2.7. Forecasting, measuring the forecast accuracy
P3. Deep Learning (4 lectures)
P3.1. Neural networks for time series
P3.2. Recurrent neural networks: RNN, GRU, LSTM
P3.3. Forecasting: direct, recursive, rolling windows
P4. Programming/computing with Python
P5. Application of the studied concepts: information and value extraction from real-world data (2 lectures)
Periodic assessment consists of:
a) An individual test (50%).
b) A team project (50%).
Periodic assessment requires that students attend at least 80% of classes. The test covers all topics.
Under periodic assessment, students must achieve a minimum grade of 9.5 in the individual test and 10 in the team project. Otherwise, they must take a final exam (minimum passing grade: 10).
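The eligibility rules above can be summarised in a small sketch. It assumes grades on a 0-20 scale, attendance expressed as a fraction of classes attended, and no rounding; the function name and the use of None to signal "must take the final exam" are illustrative choices, not part of the course regulations.

```python
def periodic_evaluation_grade(test, team_grade, attendance):
    """Return the periodic-evaluation grade, or None if the final exam is required."""
    # Minimums stated in the rules: 80% attendance, 9.5 in the individual
    # test and 10 in the team component.
    meets_requirements = (
        attendance >= 0.80 and test >= 9.5 and team_grade >= 10
    )
    if not meets_requirements:
        return None  # the student must take the final exam instead
    # Individual test and team component each count for 50%.
    return 0.5 * test + 0.5 * team_grade
```

For instance, a test grade of 12 and a team grade of 14 with 90% attendance gives 13, while a test grade of 9 returns None regardless of the other values.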
Course files (slides and scripts) to be made available on e-learning/Fenix
Yves Hilpisch (2018), Python for Finance, 2nd Edition, O'Reilly Media, Inc.
Tarek A. Atwan, (2022), Time Series Analysis with Python Cookbook, Packt Publishing.
Mills, T.C. (2019), Applied Time Series Analysis: A Practical Guide to Modeling and Forecasting, Academic Press, Elsevier Inc.
Brooks, C., (2019), Introductory Econometrics for Finance, 4th ed., Cambridge University Press.
Edward Raff, (2022), Inside Deep Learning: Math, Algorithms, Models, Manning Publications Co.
Louis Owen, (2022), Hyperparameter Tuning with Python, Packt Publishing.
James Ma Weiming, (2019), Mastering Python for Finance: Implement advanced state-of-the-art financial statistical applications using Python, 2nd Edition, Packt Publishing.
Juselius, K., (2006), The Cointegrated VAR Model: Methodology and Applications, Oxford University Press.
Project Design for Data Science
LO1. Define and characterize a specific research problem and explain its context.
LO2. Identify research questions that enable the establishment of a state of the art in relation to a concrete research problem.
LO3. Critically evaluate and discuss the results of the research and literature review on the defined research problem, demonstrating the relevance of the proposed research.
LO4. Identify a dataset that makes it possible to investigate the defined problem.
LO5. Design a data workflow able to respond to the proposed research questions.
LO6. Communicate research work and its results effectively, through written technical reports and visual and oral presentations.
CP1. Integrity and responsibilities in research.
CP2. Definition of the research topic and field.
CP3. Definition of the study’s subject, research problem, and objectives.
CP4. Conducting a literature review using appropriate methodologies.
CP5. Performing critical analyses of results: content and impact.
CP6. Scientific communication and dissemination methodologies.
This course unit focuses on establishing the initial chapters of the master's dissertation, so it does not offer assessment by a written exam worth 100% of the grade.
Assessment:
(a) an individual presentation of their work (70%),
(b) a deliverable document that includes the context, the definition of the research problem, research questions and objectives, a systematic literature review and discussion of the findings, and the design and planning of the dissertation project (30%).
> Schröer, C., Kruse, F., and Gómez, J. M. (2021). A systematic literature review on applying CRISP-DM process model. Procedia Computer Science, 181:526-534.
> Saltz, J. S. (2021). "CRISP-DM for Data Science: Strengths, Weaknesses and Potential Next Steps," 2021 IEEE International Conference on Big Data (Big Data), Orlando, FL, USA, pp. 2337-2344
> Blischak, J. D., Davenport, E. R., & Wilson, G. (2016). A Quick Introduction to Version Control with Git and GitHub. PLOS Computational Biology, 12(1).
> Thomas, C. G. (2021). Research methodology and scientific writing. 2nd edition. Springer.
> Perkel, J. M. (2018). A toolkit for data transparency takes shape. Nature, 560(7719), 513–515.
> ALLEA - The European Code of Conduct for Research Integrity, European Union, URL: https://allea.org/code-of-conduct/
> GDPR - General Data Protection Regulation, URL: https://gdpr-info.eu or https://gdpr.eu
> Artificial intelligence act, URL: https://www.europarl.europa.eu/RegData/etudes/BRIE/2021/698792/EPRS_BRI(2021)698792_EN.pdf
> Reading material to be determined throughout the classes, according to the topics covered.
Deep Learning for Computer Vision
O1: To know the basic digital image formation process
O2: To represent an image in different color spaces
O3: To perform typical image processing operations
O4: To extract low-level characteristics from an image
O5: To implement an automatic learning system based on classic algorithms for image content classification
O6: To know the typical architecture of a convolutional neural network (CNN) and to understand how it works
O7: To solve a medium-complexity image classification problem using CNNs
O8: To apply knowledge transfer and fine-tuning methodologies based on pre-trained CNNs
O9: To use deep learning algorithms to identify objects in images
O10: To know deep learning algorithms for automatic generation of multimedia content
O11: To manipulate images using the OpenCV library
O12: To use the TensorFlow library to develop machine learning applications
C1 - Image acquisition and representation
C2 - Image operations
C3 - Extraction of image features
C4 - Introduction to machine learning
C5 - Artificial neural networks
C6 - Convolutional neural networks
C7 - Knowledge transfer
C8 - Network architectures for detecting and identifying image objects
C9 - Network architectures for automatic content generation
Given the eminently practical nature of the course, there is no exam-based assessment; all assessment takes place during the semester.
Modality A (requires attendance to at least 60% of the classes)
- Exercises (30%) - group work; includes participation in class activities (10%) and two work assignments (challenges) delivered online (10% each);
- Test (30%) - individual;
- Project (40%) - group work, but evaluated individually; includes report and oral discussion.
Modality B
- Test (45%) - individual;
- Project (55%) - individual or in a group, but evaluated individually; includes report and oral discussion.
The "Project" component requires a minimum grade of 10 (out of 20), regardless of the assessment modality.
The "Test" component takes place at the end of the teaching period; a student who misses the test for a justified reason can take a new test during the exam season.
The grade of the "Project" component is subject to an oral discussion. The discussion may cap the project's grade if the student's performance falls short of the quality of the delivered project; the student may fail if the performance shown in the discussion is insufficient.
There is no process for grade improvement.
The evaluation process in the special season is identical to modality B, but the "Project" must be carried out individually.
J. Howse, J. Minichino, Learning OpenCV 4 with Python 3, 3rd Edition, Packt Publishing, 2020
M. Elgendy, Deep Learning for Vision Systems, Manning, 2020
Tomás Brandão, Course materials made available on the e-learning platform, 2024
M. Nixon, A. Aguado, Feature Extraction and Image Processing for Computer Vision, 4th Edition, Academic Press, 2019
I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press, 2016
Various authors, OpenCV library tutorials and documentation, https://opencv.org/
Various authors, TensorFlow library tutorials and documentation, https://www.tensorflow.org/
R. Szeliski, Computer Vision: Algorithms and Applications, 2nd Edition, Springer, 2021, https://szeliski.org/Book/
F. Chollet, Deep Learning with Python, 2nd Edition, Manning, 2021
Master Project in Data Science
Learning goals (LG):
LG1- Independent scientific thought and originality
LG2- Scientific skills
LG3- Logical coherence and scientific argumentation
LG4- Quality of the presentation
Syllabus contents (SC):
SC1- Formulate the starting question
SC2- Identify the relevant literature and prepare a theoretical and empirical review
SC3-Formulate the research problem and the hypotheses
SC4- Design a study to test the hypotheses
SC5- Carry out the study
SC6-Analyse and interpret the results
SC7-Elaborate the Master Project plan
SC8-Write the Master Project
A jury will assess the Master Project in a public examination, once the supervisor has approved it as complete and of sufficient quality to be presented publicly. Assessment will be based on the scientific merit of the study and on its theoretical and methodological adequacy.
Garson, G. (2001), Guide to Writing Empirical Papers, Theses, and Dissertations, Marcel Dekker Inc.
Bui, Yvonne N. (2014), How to Write a Master's Thesis, Sage Publications, Inc.
Punch, Keith F. (2016), Developing Effective Research Proposals, Sage Publications.
Master Dissertation in Data Science
Learning goals (LG):
LG1- Independent scientific thought and originality
LG2- Scientific skills
LG3- Logical coherence and scientific argumentation
LG4- Quality of the presentation
Syllabus contents (SC):
SC1- Formulate the starting question
SC2- Identify the relevant literature and prepare a theoretical and empirical review
SC3-Formulate the research problem and the hypotheses
SC4- Design a study to test the hypotheses
SC5- Carry out the study
SC6-Analyse and interpret the results
SC7-Elaborate the dissertation plan
SC8-Write the dissertation
A jury will assess the dissertation in a public examination, once the supervisor has approved it as complete and of sufficient quality to be presented publicly. Assessment will be based on the scientific merit of the study and on its theoretical and methodological adequacy.
Garson, G. (2001), Guide to Writing Empirical Papers, Theses, and Dissertations, Marcel Dekker Inc.
Bui, Yvonne N. (2014), How to Write a Master's Thesis, Sage Publications, Inc.
Punch, Keith F. (2016), Developing Effective Research Proposals, Sage Publications.
Recommended optional courses
The optional courses are identified during the review of applications, based on an analysis of each admitted candidate's prior competences, with reference to the following training plans:
Personalized plan A - for students with prior skills in Data Science:
> Knowledge and Reasoning in Artificial Intelligence
> Mathematical Methods in Machine Learning
> 2 Free optional courses
Personalized plan B - for students with no previous skills in Data Science:
> Unsupervised Statistical Learning
> Big Data Processing and Modelling
> 1 Free optional course
Personalized plan C - for students without previous skills in Data Science and Programming:
> Unsupervised Statistical Learning
> Big Data Processing and Modelling
Other alternative training programmes may be identified depending on the candidate's prior competences.
Objectives
To provide comprehensive training in Data Science, in line with current trends, market needs and emerging lines of research;
To provide knowledge and skills in advanced data analysis, especially for dealing with big data and for extracting knowledge from unstructured data (text and image);
To provide applied training aimed at developing skills and competencies in handling the latest technological tools for data science;
To train skilled professionals in the current state of the art in data governance, attribute selection and engineering, and the construction and use of learning models suitable for different data regimes and formats.
