Matematiska vetenskaper // Mathematical Sciences
Studies mathematical structures and develops them to better understand our world, for the benefit of research and technological development.
The mathematical sciences are fundamental and indispensable to a large part of modern science and engineering. Progress in other disciplines is often linked to an increased use of mathematics. Mathematics is also a subject in itself, and fundamental research is a necessary condition for its many applications.
The department is shared by Chalmers University of Technology and the University of Gothenburg.
Studying at the Department of Mathematical Sciences at Chalmers
For research and research output, please visit https://research.chalmers.se/en/organization/mathematical-sciences/
Browsing Matematiska vetenskaper // Mathematical Sciences by Program "Data science and AI (MPDSC), MSc"
- A machine learning approach for predicting bacteria content in drinking water (2023) Eric, Jonsson; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Axelson-Fisk, Marina; Dannélls, Dana; Cahn, Jacob
The current method for determining whether drinking water contains bacterial contamination is very slow, and it can take up to eight days before the results are obtained. During this time, a significant proportion of the population may have contracted diseases from contaminated water. As a mitigating action, this thesis aimed to understand whether machine learning could be a promising method for forecasting the bacteria level and how such a model could be designed. The project was performed in association with a case company called Nocoli, which is spun out of Chalmers Ventures and desired an examination of the potential implementation. A literature review including eight different case studies of how machine learning was previously applied in the field and three semi-structured interviews with industry-specific stakeholders were conducted. The research methodology originated from the fact that both an overview of the current industry situation and an assessment of machine learning applicability were required. Moreover, by using an extracted theory of machine learning algorithms for different objectives, the case studies were evaluated to find patterns that could meet the case company's demands. It was found that machine learning is promising and desired in the industry to improve current operations. The Random Forest algorithm was recommended in the initial stage due to its trade-off between accuracy and interpretability. Data on bacterial content and other factors including weather was intended as the data source. The recommendation included a 3:1:1 split between training, validation, and test sets as well as using a recursive feature selection algorithm.
Additionally, a combination of error measures was recommended including Mean Squared Error with an out-of-bag supplement to reduce overfitting. Furthermore, although no data could be obtained to evaluate the recommended model, it was concluded that machine learning could have a positive impact on today’s approach and contribute to improved water management and safety by enabling reliable forecasts.
- Adaptive Radar Illuminations with Deep Reinforcement Learning: Illumination Scheduling for Long Range Surveillance Radar with the use of Proximal Policy Optimization (2023) Sandelius, Samuel; Ekelund Karlsson, Albin; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Helgesson, Peter; Andersson, Adam
A modern radar antenna can direct its energy electronically without inertia or the need for mechanical steering. This opens up several degrees of freedom such as transmission direction and illumination time, and thus also the potential to optimise operation in real time. Long range surveillance radars solve the trade-off between searching for new targets and tracking known targets. This optimisation is often rule-based. In recent years, Reinforcement Learning (RL) algorithms have been able to efficiently solve increasingly difficult tasks, such as mastering game strategies or solving complex control tasks. In this thesis we show that reinforcement learning can outperform such rule-based approaches for a simulated radar.
- Causal effect of carbon footprint calculators (2022) Hultén, Louise; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Picchini, Umberto; Schauer, Moritz
This master thesis aims to answer whether theory on causality and multivariate time series are relevant tools for questions that might arise in the context of different tracking apps. The context is the mobile application Svalna, which is a research-based carbon calculator designed to help people track and reduce their emissions. It has been shown that information provision can impact behavior, so the central question is whether using the Svalna application impacts the users' consumption. I introduce a statistical approach to analyse multivariate time series like those gathered through Svalna. I create a data generation model to test the suggested statistical model. As an intermediate check, the model is used to evaluate a data set from Svalna's users. I conclude that the mechanisms of the developed models function in well-behaved data and that the model should be seen as an intermediate step towards a model to analyze real data from Svalna. I think it is a useful approach that can contribute to understanding behavioural change and to better app design.
- Critical Event Prediction in Logs at Customer Network (2022) Hajizada, Elmar; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Jonasson, Johan; Jonasson, Johan; Akrami, Rozita
Implementing effective maintenance prognosis for radio units at Ericsson can result in a number of benefits, including better system safety, improved operational reliability, longer equipment lifespan, and lower maintenance costs. Preventive investigations and repairs on the hardware and software level can be done to prevent the radio unit from failing by forecasting whether or not the radio unit will have an alarm in the near future. The goal of this thesis was to use multiple logs taken from a radio unit to predict whether an alarm would occur in the next one to nine days. The log file contents have been divided into chunks using different approaches such as expanding window, independent chunks, and time interval chunks, where each chunk is labeled according to the timestamp of the alarm. Ericsson has used a combination of verdicts (features that are defined by subject matter experts) to extract the best features from the log files. This rule-based approach is inefficient since it requires modification of the script using expert knowledge when there is a change in the design of the hardware. The purpose of this thesis project was achieved using data-driven NLP approaches including log parsers and word embeddings. An independent chunks approach with the Drain log parser, using concatenated bag-of-words representations for each log file fitted on the XGBoost model, outperformed other combinations of log parsers and word embeddings. An LSTM model was used with one-day interval chunks to see if a complex sequential model can achieve a sufficient score. Experiments using complex sequential models, such as the LSTM many-to-many model with doc2vec embedding, have shown that they can predict alerts before they occur. All the tested models were evaluated using cross-validation.
The XGBoost model with the independent chunks approach using the Drain log parser and BOW embedding achieved an average F1-score of 0.873, while the LSTM model with the time interval chunks approach using doc2vec embedding achieved an average F1-score of 0.853 across shifting time periods from one to nine days.
- De-identification of Swedish medical chat messages with transformers (2022) Arvidsson, David; Gerle, William; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Axelson-Fisk, Marina; Dannélls, Dana
Healthcare in Sweden is becoming more digital, and even though new technology could enable improved healthcare, it also presents risks. In this thesis, which is conducted together with Visiba Care Sweden AB, data security and privacy risks are of special interest. Visiba Care offers a virtual care platform, where it is possible for patients and healthcare professionals to chat. If chat messages could be de-identified, they could be stored and used to improve healthcare for their patients. The de-identification topic is widely studied within machine learning; however, the research on Swedish medical corpora is limited, specifically when considering text corpora which consist of chat messages. Using KB-BERT for named entity recognition (NER), this thesis investigated whether it was possible to reach the same performance on Swedish medical chat messages as the current state-of-the-art NER model reaches on Swedish electronic patient records. Furthermore, the thesis investigated the importance of training data size within this domain and also whether a KB-BERT NER model trained on rule-based annotated data could reach higher performance than the rules it had been trained on. Data was collected from two of Visiba Care's customers. The annotation process followed strict annotation rules, where first a rule-based script annotated the data before a manual review was conducted. KB-BERT was accessed through the open source library Hugging Face and the hyperparameters were tuned using random search to optimize performance. Furthermore, the decision threshold was tuned to improve recall since this metric was considered to be more important than precision in the given domain.
The results showed that it was possible to exceed current state-of-the-art performance and also that using one class for all entities led to a further performance increase. Regarding training data size, the results showed that not only size is important, but also the format of the entities. Lastly, we failed to create a KB-BERT model trained on rule-based annotated data which reached higher performance than the rules it had been trained on. A potential explanation for this could be that the rule-based script did not produce annotations of high enough quality.
- Decentralized Deep Learning under Distributed Concept Drift: A Novel Approach to Dealing with Changes in Data Distributions Over Clients and Over Time (2023) Klefbom, Emilie; Örtenberg Toftås, Marcus; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Modin, Klas; Dubhashi, Devdatt
In decentralized deep learning, clients train local models in a peer-to-peer fashion by sharing model parameters, rather than data. This allows collective model training in cases where data may be sensitive or for other reasons unable to be transferred. In this setting, variations in data distributions across clients have been extensively studied; however, variations over time have received no attention. This project proposes a solution to address decentralized learning where the data distributions vary both across clients and over time. We propose a novel algorithm that can adapt to the evolving concepts in the network without any prior knowledge or estimation of the number of concepts. Evaluation of the algorithm is done using standard benchmarks adapted to the temporal setting, where it outperforms previous methods for decentralized learning.
- Emergence of Agency from a Causal Perspective (2024) Ånestrand, Alvin; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Lundh, Torbjörn; Häggström, Olle; Fox, James; Everitt, Tom
Causal models of agents and agentic behavior allow for safety analysis of machine learning systems. Understanding how goal-directed behavior emerges from adapting to an environment is, however, non-trivial. This thesis addresses the gap between theoretical models and real-world implementations of machine learning systems through a framework that formalizes the connection between system dynamics and goal-driven behavior. This thesis introduces novel probabilistic graphical models for describing system dynamics involving learning agents, based on dynamic Bayesian networks, which allow for a flexible representation of causal relationships in the training environment. To analyze goal-directed behavior that emerges from interactions between agents and the environment, the thesis also introduces temporally abstracted models. Such a model captures the dynamics of a system after the learning process has converged, derived from a model of the learning process. A temporally abstracted model describes potential outcomes involving equilibria between agents and the environment, and can under certain conditions be viewed as a model of goal-directed behavior in the system.
- Exploring Future Pricing Strategies for Electric Heavy-Duty Road Freight Services (2023) Hegardt, Johan; Hedenblad, Leonard; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Axelson-Fisk, Marina; Lindberg, Carl
The road freight industry is undergoing a transition with the increase in demand for electric trucks and increased digitalisation. The pricing strategies in this industry are still underdeveloped and need reformation. This thesis project aims to (1) investigate the implications of various pricing strategies for heavy-duty road freight services in a digitalised, electric-only urban environment, and (2) provide insights into the development of effective pricing strategies that balance profitability and risk while accounting for the challenges of a future environment with new technologies, cost structures, electrification, and digitalisation. A methodology was used that incorporates Multi-Objective Robust Optimisation (MORO) and scenario analysis to identify robust pricing policy alternatives that can withstand different stochastic realisations of both deep uncertainties and well-characterised uncertainties. The methodology uses EMA (Exploratory Modeling and Analysis) and EMA Workbench as computational modeling tools to analyse complex systems. The methodology section outlines the research design used to achieve the research objectives. A conceptual XLRM model of the system, with relevant pricing levers and uncertainties, was developed through a literature review and expert opinions from the collaborating case company, and was then translated into a computational model using EMA Workbench. Exploratory research using scenario analysis and feature scoring was conducted to assess risks and benefits associated with each pricing strategy, and sensitivity analysis was used to identify parameters with the greatest impact on outcomes of interest.
The results of the study show that the methodology incorporating MORO and scenario analysis can be used to explore pricing strategies in systems of deep uncertainty. Twelve optimal pricing policies were suggested and sensitivity analysis was used to identify features with the greatest impact on outcomes of interest. The study provides insights into potential risks and benefits associated with different pricing strategies in a transportation system characterised by deep uncertainty. The study concludes that there is no one-size-fits-all pricing policy; rather, the best-performing policies depend on a company’s goals and uncertainties. The 12 optimal pricing policies were divided between dynamic pricing policies, which price each customer individually, flat per km pricing policies, which set a fixed price per km for all customers, and flat per month pricing policies, which set the same price for each customer. Two of the dynamic pricing policies were found to be top performers, while the only selected flat per month approach seems to be suitable for maintaining predictability of profits and cash flows along with maximising market share and capacity utilisation rate, rather than maximising total profit. Computational models like the MORO approach can be used to explore pricing strategies in deep uncertainty, but decision makers should be cautious of the assumptions and parameters of the model. Future research should explore alternative methodologies and consider behavioral mechanisms in pricing strategies. Overall, this report provides valuable insights into decision making on pricing strategies for heavy-duty electric road freight under deep uncertainty, i.e., in which sorts of scenarios different pricing strategies perform optimally and when certain pricing strategies should be avoided.
- Exploring Supervision Levels for Patent Classification (2022) van Hoewijk, Adam; Holmström, Henrik; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Axelson-Fisk, Marina; Dannélls, Dana
Machine learning can help automate monotonous work. However, most approaches use supervised learning, requiring a labeled dataset. The consulting firm Konsert Strategy & IP AB (Konsert) sees great value in automating its task of manually classifying patents into a custom technology tree. But the ever-changing categories leave a pre-labeled dataset unavailable. Can other forms of supervision be used for machine learning to excel without extensive data? This thesis explores how weakly supervised, semi-supervised, and supervised learning can help Konsert to classify patents with minimal hand-labeling. Furthermore, the effect of class granularity on performance is explored, alongside whether or not using patents’ unique characteristics can help. Two existing state-of-the-art methods at two supervision levels are employed: firstly, LOTClass, a keyword-based weakly supervised approach; secondly, MixText, a semi-supervised approach. We also propose LabelLR, a supervised approach based on patents’ cooperative patent classification (CPC) labels. Each method is tested on all granularity levels of a technology tree provided by Konsert alongside a combined ensemble of the three methods. MixText receives all unlabeled patent abstracts together with the same ten labeled documents per class that LabelLR receives. LOTClass, on the other hand, receives the unlabeled abstracts along with class keywords. Results reveal that the small training dataset of around 4,200 patents leaves LOTClass struggling while MixText excels. LabelLR outperforms MixText on the rare occasion when the CPC labels and the classifications closely match. The ensemble proves more consistent than LabelLR but only outperforms MixText on some granular classes.
In conclusion, a semi-supervised approach appears to be the best balance of minimal manual work and classification proficiency, reaching an accuracy of 60.7% on 33 classes using only ten labeled patents per class.
- Improving Algorithmic Text Moderation via Context-Based Representations of Word Semantics (2021) Nordén, Felix; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Jonasson, Johan; Johansson, Fredrik
Reliable text moderation requires proper domain knowledge. With scaling requirements increasing as platforms of the Internet grow larger and larger, the prevalence of algorithmic text moderation has increased with the intention to alleviate, or even replace, its manual counterpart. Nonetheless, these algorithm-based solutions are harder to interpret and evaluate, and risk being biased in their decision making, resulting in more rigid and error-prone behavior when changes in context end up shifting the semantics of the text itself. To solve these shortcomings, this thesis presents an approach that learns semantic nuances within shorter pieces of text when given a related context represented by various layers of information. For this purpose, the sentence transformer architecture is employed, which jointly learns embeddings of the short-form text and its context. The embeddings are used as input to a log-loss optimized, fully-connected network to classify the appropriateness of the text. Furthermore, the thesis investigates the tradeoff between gained performance and added time and implementation complexity for each additional layer of information. The approach is evaluated on chat data from Twitch – a live-streaming service – where the related context for each message is built up incrementally; first by introducing a layer of stream metadata and then augmenting the stream metadata by introducing a layer of related game metadata provided by IGDB – the Internet Game Database.
From the results, the approach demonstrates that representing a context using both stream and game metadata has a significant impact on the performance, yielding an F1 score of 0.37 compared to 0.18 and an AUROC score of 0.63 compared to 0.45 of the best-performing baseline. Furthermore, a linear time complexity dependence is identified on the number of sentences to embed per datapoint, causing a forward pass to take at worst 78 ms per datapoint. With this, it is concluded that contextual information is able to improve predictive performance for algorithmic text moderation on shorter pieces of text. Additionally, exploring contextual relevance of data is easy when using sentence transformers, albeit with a linear growth in time complexity.
- Incorporating Interior Property Images for Predicting Housing Values (2024) Gortzak, Adrian; Ulusoy, Nedim Can; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Särkkä, Aila; Malekipirbazari, Milad
The property valuation process for the real estate market is essential for predicting a fair market value. This process is traditionally carried out by brokers, including inspecting and assessing the subject property to find comparable sales for comparative market analysis (CMA). Meanwhile, an automated valuation model (AVM) can help achieve an autonomous version of this process, which speeds up the process but lacks some of the inputs that a manual assessment provides. AVMs have difficulty considering more subjective architectural qualities, such as beauty, stability, and utility, due to the difficulty of quantifying these aspects objectively. New advancements in Vision Transformers (ViT), self-supervised learning, and Contrastive Language-Image Pre-training (CLIP) technologies have shown favourable improvements in the field of computer vision. Therefore, this study explores the potential improvements of these new techniques within the visual feature extraction task to enhance AVMs using interior images. By applying ViTs for binary classification, clustering, and textual description matching, we aim to enrich the feature extraction process for a property valuation model in the region of Uppsala County, Sweden. Our findings show modest enhancements in the AVM’s performance, which align with prior studies, but also highlight that these new technologies can extract more detailed features compared to previous methods. Furthermore, they demonstrate the potential for these technologies to capture more comprehensible architectural qualities from images, which could significantly assist brokers in the valuation process.
- Investigating a Byzantine Resilient Framework for the Adam Optimizer (2023) Fabris, Basil; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Ringh, Axel; Farhadkhani, Sadegh; Ringh, Axel
Over the past few years, the utilization of Machine Learning has experienced tremendous growth across various domains, ranging from engineering to marketing. This widespread adoption of Machine Learning has been made possible by advancements in hardware, which have facilitated the training of increasingly large machine learning models. However, these models have given rise to larger datasets and raised concerns regarding data safety and privacy. To address these challenges, Distributed Machine Learning has emerged as a promising solution. By training models locally on participants’ devices, Distributed Machine Learning enhances privacy as raw data remains on the respective devices, while also reducing the need for specialized and novel hardware, as most of the computation takes place on participants’ devices. Nonetheless, due to the lack of control over participants, Distributed Machine Learning is susceptible to attacks carried out by misbehaving (Byzantine) participants. This research introduces two Adam-based optimization frameworks for Distributed Machine Learning. Both frameworks are evaluated through empirical analysis using homogeneous and heterogeneous datasets, and their performance is assessed against multiple state-of-the-art attacks. Additionally, we present preliminary evidence of convergence for DRA (Distributed Robust Adam) on homogeneously distributed data.
- Investigation Of Phylogenetic Relations Using Graph Data Science Algorithms (2021) Rahavachari, Ankita; Subramanian, Guru Prakash; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Axelson-Fisk, Marina; Wang, Hao
Driven by the vast number of fast-growing biological databases, a total of 2.1 million diverse species have been categorized within the NCBI taxonomy database by DNA, RNA, protein, or genome sequences. This thesis focuses on performing a comprehensive analysis of the classified taxonomic branches and nodes in the taxonomy database by utilizing various graph algorithms. By converting these taxonomy data into a Neo4j database, a super graph with 2,121,053 unique branches and 2,323,131 intermediate and end nodes was obtained in a rooted tree structure. In contrast to the classic Linnaean system with eight major ranks (from domain to species), there are 37 additional taxonomic ranks that have been used in describing the complicated phylogeny of the accumulated species. Surprisingly, nearly 10% of the taxonomic nodes have a rank of either "norank" or "clade", remain unclassified, and await systematic assignment. In addition, a partial investigation of cases where taxonomic ranks are skipped revealed thousands of lineages that lack one or more major ranks. They deviate from the classic taxon hierarchy defined in the Linnaean system, which appears to lag behind the pace of current biological advancement and should be revisited for upgrading. Finally, a bioinformatic tool for estimating phylogenetic distance between any two given organisms was developed and provided with a graphical interface for user exploration.
- Machine Learning-based Lane-level Localization (2023) Udayakumar, Amogha; Sundararaman, Bragadesh Bharatwaj; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Axelson-Fisk, Marina; Axelson-Fisk, Marina; Beauvisage, Axel; Fu, Junsheng
In autonomous driving, for the vehicle to make a decision on its own, it has to have precise knowledge of its location (lane) with respect to its environment. This problem of determining the lane in which the vehicle is travelling is called Lane-Level Localization (LLL). HD maps with localization algorithms are used to solve the problem of Lane-Level Localization. However, during the initialization phase, there is uncertainty about the lane in which the vehicle is driving. Our thesis aims to overcome this problem by using the Multi-Hypothesis Tracking (MHT) approach. Multi-Hypothesis Tracking has two parts: tracking multiple hypotheses, and inferring the correct hypothesis by eliminating the wrong hypotheses. Our thesis focuses on the latter part, which can be solved using the early classification of time series technique. This technique aims to make an early and accurate classification of the hypothesis by rejecting wrong hypotheses based on the class probabilities assigned to different hypotheses, and uses multi-objective optimization to find a trade-off between earliness and accuracy. Our model produced an accuracy of 99.053% with an earliness of 0.109.
- PENS: Leveraging Data Heterogeneity in Federated Learning (2021) Onoszko, Noa; Karlsson, Gustav; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Gebäck, Tobias; Schauer, Moritz; Listo Zec, Edvin
Federated learning (FL) is a decentralized machine learning technique where training is done cooperatively by exchanging model weights or gradients instead of sharing the raw data between the cooperating devices (clients). Classical FL algorithms such as federated averaging work best in the special case when the data is IID over clients. In this work, we address the problem of data heterogeneity in federated learning. We propose a decentralized federated learning (DFL) algorithm termed Performance-based Neighbour Selection Federated Learning Algorithm (PENS), which effectively leverages the data heterogeneity over clients. PENS is a cooperative communication-based algorithm where clients communicate with other clients that have a similar data distribution. Specifically, model performance is used as a proxy for data similarity, as no raw data is allowed to be shared among clients. Experiments on the CIFAR-10 dataset show that this communication scheme results in higher model accuracies than if clients communicate randomly with each other. The method is robust to different numbers of participating clients as long as the local datasets are sufficiently large.
- Pharmaceutical assay search with AI (2024) Alladin, Ali; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Jonasson, Johan; Jonasson, Johan
Retrieving historical assay data in pharmaceutical research is often restricted by reliance on specific metadata, overlooking the contextual information in associated protocol documents. This thesis investigates the potential of utilizing these plain English protocol documents alongside Natural Language Processing (NLP) techniques to implement semantic search for assays. A baseline TF-IDF model and the Transformer models BERT, SBERT, and Longformer were used to get embeddings of protocol documents from a corpus of historical protocols. Their performance in retrieving relevant historical protocols was evaluated based on key technical criteria, where the TF-IDF models and BERT using the chunking technique showed the best results. However, limitations in the evaluation scope introduce some uncertainty to the findings, highlighting the need for more rigorous validation. Nevertheless, the conclusions suggest that integrating NLP-driven semantic search systems could reduce the time and manual effort required for assay retrieval, even though the current approach may need further refinement for practical application. These insights are a promising foundation for developing AI-powered search systems used for pharmaceutical texts.
- Playtesting Match 3 Games with PPO (2023) Malec, Stanislaw; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Andersson, Adam; Haghir Chehreghani, Morteza
The training of proximal policy optimization agents with action masking on stochastic match-3 environments is explored in this thesis. A performant, feature-rich match-3 simulator is developed, and experiments demonstrate improved performance over a random policy on both seen and unseen levels. Furthermore, the best generalization performance is achieved when training is done by sampling levels from a subset of levels.
- Probabilistic Calibration in a Few-Shot Domain Adaptation Setting (2024) Adamsson, Oscar; Röst, Jonas; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Modin, Klas; Modin, Klas
  This thesis investigates the application of probabilistic machine learning models within a Few-Shot Domain Adaptation (FSDA) setting to address covariate shifts induced by new operational conditions for electric trucks. By leveraging data-driven methods instead of the truck’s physical properties, the thesis assesses the characteristics and robustness of different machine learning models in predicting energy consumption under various new conditions. The study focuses on scenarios involving uni-, bi-, and multivariate covariate shifts posed by colder temperatures, a new route type, and a new vehicle manufacturer. Utilizing real-world data from electric truck drives, the models are trained on source data, adapted using limited target domain data, and assessed for their probabilistic calibration in the target domain. The findings indicate that, of the two baseline models (both Ridge regression models), the source-trained baseline performs well under simpler shifts but struggles with multivariate shifts, where the target-only baseline excels given sufficient target domain data. Hierarchical Bayesian linear regression shows high adaptability when covariate shift affects hierarchical levels of the model. Gaussian process regression improves comparatively well with adaptation. However, the results indicate a possible sensitivity to kernel selection. Bayesian neural networks face challenges with prediction mean accuracy and high sensitivity to individual samples; further research is needed to determine the model’s feasibility in an FSDA setting. These insights provide valuable guidance for fleet management companies in improving decision-making and operational efficiency under new driving conditions through accurate probabilistic energy consumption modeling.
- Solving Problems, One Role at a Time (2023) Dunér, Felix; Johansson, Eric; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Axelson-Fisk, Marina; Dannélls, Dana; Musial, Mariusz
  For large companies, leveraging internal knowledge and existing information within the organization has proved to be difficult for several reasons. In this thesis, which is conducted in collaboration with Ericsson, an attempt to facilitate the extraction of internal knowledge is made, more specifically by matching new issues that employees face with pre-existing, solved ones. The issues are represented by so-called ‘support tickets’ and partly consist of manually entered text where the user describes the problem. The support process could be optimized by automatically identifying what kind of issue the user experiences. This study aims to investigate if it is possible to extract semantic information from the text contained in support tickets through semantic role labeling (SRL), and leverage that information to match similar issues related to Ericsson’s cloud infrastructure branch. SRL is often used for information extraction and question-answering, but not in a technical domain. Two pre-trained SRL models were tested: one based on FrameNet and the other based on PropBank. Eventually, the FrameNet model was used throughout the thesis. After initial preprocessing and standardization of technical jargon, pre-trained state-of-the-art (SOTA) models were used to extract semantic information, and visual analysis and overall statistics supported the idea that they could identify relevant targets in sentences and populate frames with roles accordingly. The information yielded through SRL allowed for new ways of representing the support tickets. However, further experiments with topic modeling and classification indicated that the information produced by the FrameNet SRL model was not useful for grouping support tickets according to the categorizations provided by Ericsson.
  It is suggested that the FrameNet model may be too general for the specific context and that customization of the semantic framework may be a possible solution. It is also noted that the categorizations used as similarity proxies for the support tickets may be based on information outside of the text used to represent the support tickets. Even though the semantic information yielded through SRL did not improve the ability to match similar support tickets in this case, we firmly believe that these features can be helpful. Since the semantic frames provide information otherwise not present in the text, they should be able to enrich the representation.
- Sparse Time Series Demand Forecasting for Intermittent Availability (2023) Helgesson, Oscar; Laszlo, Norbert; Chalmers tekniska högskola / Institutionen för matematiska vetenskaper; Ringh, Axel; Carlsson, Emil
  This thesis addresses the challenge of forecasting sales for individual perishable markdown products using historical sales data and other relevant features in the form of time series. The target time series, i.e., daily sales of marked-down units of a certain product, is intermittent, sparse, and highly irregular, and sales can only occur if the marked-down product is available. To solve the problem, various methods were evaluated, ranging from well-established statistical models to newer deep learning-based models. This thesis proposes an interpretable novel method that improves the Temporal Fusion Transformer model with cluster encodings by applying random convolutional kernel transformations to time series. The study found that the compared deep learning models, particularly the RNN and the Temporal Fusion Transformer, outperformed the baseline statistical models. The novel approach of clustering the markdown series based on markdown features showed no significant change in performance for day-to-day prediction. However, it did show a significant improvement in multi-horizon aggregated predictions. Moreover, using clustering reduced the time needed to train the models. Overall, the results suggest that deep learning models and the Temporal Fusion Transformer with added cluster encodings are promising models for predicting intermittent series with known available inventory. This study has practical implications for retailers and businesses that sell perishable products. Accurately forecasting sales of markdown products can help reduce waste and optimize inventory management, resulting in cost savings and increased profitability.