The industry is shifting towards autonomous systems capable of detecting and adapting to machine hardware faults. Traditional fault tolerance involves duplicating components and reconfiguring processes, but reinforcement learning (RL) offers a new approach. This paper explores the potential of two RL algorithms, PPO and SAC, for enhancing hardware fault tolerance in machines. Tested in OpenAI Gym environments (Ant-v2, FetchReach-v1) with six simulated faults, results show RL enables rapid adaptation. PPO adapts best by retaining knowledge, while SAC performs better when discarding it, highlighting RL's potential for developing robust, adaptive machines.
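A minimal sketch of this two-phase setup, assuming gymnasium and stable-baselines3; Ant-v4 stands in for the paper's Ant-v2, and the `BrokenActuator` wrapper is a hypothetical stand-in for the paper's six simulated faults:

```python
# Sketch: train PPO on a healthy machine, then inject a fault and keep
# training from the learned weights (the "retain knowledge" condition).
import gymnasium as gym
import numpy as np
from stable_baselines3 import PPO


class BrokenActuator(gym.ActionWrapper):
    """Hypothetical fault: one actuator stops responding to commands."""

    def __init__(self, env, faulty_joint=0):
        super().__init__(env)
        self.faulty_joint = faulty_joint

    def action(self, action):
        action = np.array(action, copy=True)
        action[self.faulty_joint] = 0.0  # the joint no longer moves
        return action


# Phase 1: learn a policy on the fault-free environment.
model = PPO("MlpPolicy", gym.make("Ant-v4"), verbose=0)
model.learn(total_timesteps=100_000)

# Phase 2: inject the fault and continue training without resetting the
# policy, so the agent adapts its existing knowledge to the damage.
model.set_env(BrokenActuator(gym.make("Ant-v4"), faulty_joint=2))
model.learn(total_timesteps=50_000)
```

The SAC variant would follow the same loop but reinitialize the policy before phase 2, matching the "discard knowledge" condition that favored SAC.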
Deep reinforcement learning (DRL) has been successful in robotics but lacks explainability. This research proposes using Graph Networks and Layer-wise Relevance Propagation (LRP) to analyze the learned representations of a robot's observations. By representing observations as entity-relationship graphs, we can interpret the robot's decision-making process, compare different policies, and understand how the robot recovers from errors. This approach contributes to making DRL in robotics more transparent and understandable.
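To make the LRP step concrete, here is a minimal NumPy sketch of the standard epsilon rule on a toy two-layer ReLU network; the weights and the input observation are random placeholders, not the paper's trained graph-network policy:

```python
# Epsilon-rule LRP: propagate the chosen action's output relevance back
# to the observation; large entries mark the inputs the decision used.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 4)), np.zeros(4)   # observation -> hidden
W2, b2 = rng.normal(size=(4, 2)), np.zeros(2)   # hidden -> action logits

x = rng.normal(size=8)                          # robot observation
h = np.maximum(0.0, x @ W1 + b1)                # forward pass
y = h @ W2 + b2


def lrp_epsilon(a, W, b, relevance, eps=1e-6):
    """Redistribute relevance from one layer to the layer below."""
    z = a @ W + b                                # pre-activations
    s = relevance / (z + eps * np.sign(z))       # stabilized ratio
    return a * (s @ W.T)                         # relevance per input unit


r_out = np.zeros_like(y)
r_out[np.argmax(y)] = y[np.argmax(y)]            # start from chosen action
r_hidden = lrp_epsilon(h, W2, b2, r_out)
r_input = lrp_epsilon(x, W1, b1, r_hidden)
print(r_input)                                   # relevance per observation dim
```

In the entity-relationship setting, these input relevances would be aggregated per graph node, so the explanation reads as "which entities mattered for this action."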
We propose novel grey-box Bayesian optimization algorithms that learn domain-specific knowledge inherent to the function being optimized. This knowledge is incorporated into acquisition functions that guide the choice of the next candidate point. In the example provided below, step 1 calculates a credit for each parameter of the candidate point (brighter colors mean more credit), from which we construct an information profile used to calculate the expected information gain of each parameter in step 2. Step 2 also uses the previously computed expected information gain of the best point found so far, x_star. The expected information gain then determines the next query point.
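A heavily hedged sketch of the two-step loop, built on a standard scikit-learn Gaussian process; the `credit` and `expected_info_gain` functions below are hypothetical placeholders for the paper's domain-specific definitions, not its actual formulas:

```python
# Sketch: score candidates by a per-parameter credit profile combined
# with the credit profile of the incumbent best point x_star.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)


def objective(x):                       # black-box function to optimize
    return -np.sum((x - 0.3) ** 2)


X = rng.uniform(size=(5, 2))            # initial design, 2 parameters
y = np.array([objective(x) for x in X])
gp = GaussianProcessRegressor().fit(X, y)
candidates = rng.uniform(size=(256, 2))


# Step 1 (assumption): per-parameter credit, approximated here by
# finite differences of the GP posterior mean.
def credit(x, h=1e-3):
    base = gp.predict(x[None])[0]
    return np.array([abs(gp.predict((x + h * np.eye(2)[i])[None])[0] - base) / h
                     for i in range(2)])


# Step 2 (assumption): expected information gain weighted against the
# information profile of the incumbent best point x_star.
x_star = X[np.argmax(y)]
profile = credit(x_star)


def expected_info_gain(x):
    _, std = gp.predict(x[None], return_std=True)
    return std[0] * float(credit(x) @ profile)


next_x = max(candidates, key=expected_info_gain)
print("next query point:", next_x)
```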
We propose a framework at whose heart lies an optimization algorithm. The algorithm iteratively observes the outcome of the previous sensor placement and proposes the next one accordingly. Each candidate sensor placement is passed to simulation software that produces a synthetic but realistic dataset based on the occupants' daily activity plans, the indoor space layout, and the sensors. The dataset consists of occupants' activities and the corresponding sensor readings. An activity classifier then uses this dataset to train and test an activity recognition model, and the model's performance is reported as the quality of the candidate sensor placement.
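A minimal sketch of this loop, where `simulate_dataset` and the k-NN classifier are toy stand-ins for the simulation software and the activity recognition model, and random search stands in for the optimizer:

```python
# Sketch: propose a placement, simulate data, train/test a classifier,
# and use its accuracy as the placement's quality score.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)


def simulate_dataset(placement, n=400):
    """Toy stand-in: readings are more informative when the placement
    is near the hotspots where activities happen."""
    labels = rng.integers(0, 2, size=n)             # occupant activities
    hotspots = np.array([[0.2, 0.8], [0.8, 0.2]])
    signal = -np.linalg.norm(placement - hotspots[labels], axis=1)
    readings = np.c_[signal, rng.normal(scale=0.5, size=n)]
    return readings, labels


def placement_quality(placement):
    """Train/test an activity classifier; accuracy scores the placement."""
    X, y = simulate_dataset(placement)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    return KNeighborsClassifier().fit(X_tr, y_tr).score(X_te, y_te)


# Outer loop: observe the previous outcome, propose the next placement.
best, best_q = None, -np.inf
for _ in range(20):
    candidate = rng.uniform(size=2)                 # (x, y) in the room
    q = placement_quality(candidate)
    if q > best_q:
        best, best_q = candidate, q
print("best placement:", best, "accuracy:", round(best_q, 3))
```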
Optimizing sensor placement in smart homes and buildings is challenging due to the time and cost of real-world testing. This research presents a simulation tool called SIMsis that models indoor spaces, occupant activities, and sensor behaviors. SIMsis generates realistic sensor data over time, which can be used to evaluate different sensor configurations without physical experimentation. We tested SIMsis in a smart home setting against real-world measurements and found that it effectively simulates real-world conditions, making it a valuable tool for developing and deploying sensor-based applications.
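For illustration only, a toy generator of time-stamped motion-sensor events from an activity schedule; the schedule, rooms, and firing probabilities below are invented and do not reproduce SIMsis's actual models or API:

```python
# Sketch: walk through a daily activity plan minute by minute and emit
# binary motion-sensor events, with a small false-positive noise floor.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical daily plan: (activity, room, duration in minutes).
schedule = [("sleep", "bedroom", 420), ("cook", "kitchen", 45),
            ("eat", "kitchen", 30), ("work", "office", 240)]

# One motion sensor per room, with a per-minute firing probability
# while the room is occupied.
sensors = {"bedroom": 0.2, "kitchen": 0.7, "office": 0.5}

events, t = [], 0
for activity, room, minutes in schedule:
    for _ in range(minutes):
        for sensor_room, fire_p in sensors.items():
            p = fire_p if sensor_room == room else 0.01  # noise floor
            if rng.random() < p:
                events.append((t, sensor_room, activity))
        t += 1

print(len(events), "sensor readings; first:", events[0])
```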