The project has resulted in the publications listed below.
Journal papers
-
L. Busoniu, V. Varma, J. Loheac, A. Codrean, O. Stefan, C. Morarescu, S. Lasaulce,
Learning control for transmission and navigation with a mobile robot under unknown communication rates.
Control Engineering Practice,
vol. 100,
2020.
In press.
Abstract: In tasks such as surveying or monitoring remote regions, an autonomous robot must move while transmitting data over a wireless network with unknown, position-dependent transmission rates. For such a robot, this paper considers the problem of transmitting a data buffer in minimum time, while possibly also navigating towards a goal position. Two approaches are proposed, each consisting of a machine-learning component that estimates the rate function from samples; and of an optimal-control component that moves the robot given the current rate function estimate. Simple obstacle avoidance is performed for the case without a goal position. In extensive simulations, these methods achieve competitive performance compared to known-rate and unknown-rate baselines. A real indoor experiment is provided in which a Parrot AR.Drone 2 successfully learns to transmit the buffer.
Online at ScienceDirect.
-
D. Mezei, L. Tamas, L. Busoniu,
Sorting Objects from a Conveyor Belt Using POMDPs with Multiple-Object Observations and Information-Gain Rewards.
Sensors,
vol. 20,
no. 9,
2020.
Abstract: We consider a robot that must sort objects transported by a conveyor belt into different classes. Multiple observations must be performed before taking a decision on the class of each object, because the imperfect sensing sometimes detects the incorrect object class. The objective is to sort the sequence of objects in a minimal number of observation and decision steps. We describe this task in the framework of partially observable Markov decision processes, and we propose a reward function that explicitly takes into account the information gain of the viewpoint selection actions applied. The DESPOT algorithm is applied to solve the problem, automatically obtaining a sequence of observation viewpoints and class decision actions. Observations are made either only for the object on the first position of the conveyor belt or for multiple adjacent positions at once. The performance of the single- and multiple-position variants is compared, and the impact of including the information gain is analyzed. Real-life experiments with a Baxter robot and an industrial conveyor belt are provided.
Online at MDPI.
-
G. Feng, L. Busoniu, T.M. Guerra, S. Mohammad,
Data-Efficient Reinforcement Learning for Energy Optimization of Power-Assisted Wheelchairs.
IEEE Transactions on Industrial Electronics,
vol. 66,
no. 12,
pages 9734–9744,
2019.
Abstract: The objective of this paper is to develop a method for assisting users to push power-assisted wheelchairs (PAWs) in such a way that the electrical energy consumption over a predefined distance-to-go is optimal, while at the same time bringing users to a desired fatigue level. This assistive task is formulated as an optimal control problem and solved by Feng et al. using the model-free gradient of partially observable Markov decision processes (GPOMDP) approach. To increase the data efficiency of the model-free framework, we here propose to use policy learning by weighting exploration with the returns (PoWER) with 25 control parameters. Moreover, we provide a new near-optimality analysis of the finite-horizon fuzzy Q-iteration, which derives a model-based baseline solution to verify numerically the near-optimality of the presented model-free approaches. Simulation results show that the PoWER algorithm with the new parameterization converges to a near-optimal solution within 200 trials and possesses the adaptability to cope with changes of the human fatigue dynamics. Finally, 24 experimental trials are carried out on the PAW system, with fatigue feedback provided by the user via a joystick. The performance tends to increase gradually after learning. The results obtained demonstrate the effectiveness and the feasibility of PoWER in our application.
Online at IEEEXplore.
-
L. Busoniu, J. Ben Rejeb, I. Lal, I.-C. Morarescu, J. Daafouz,
Optimistic minimax search for noncooperative switched control with or without dwell time.
Automatica,
vol. 112,
2020.
Abstract: We consider adversarial problems in which two agents control two switching signals, the first agent aiming to maximize a discounted sum of rewards, and the second aiming to minimize it. Both signals may be subject to constraints on the dwell time after a switch. We search the tree of possible mode sequences with an algorithm called optimistic minimax search with dwell time (OMSd), showing that it obtains a solution close to the minimax-optimal one, and we characterize the rate at which the suboptimality goes to zero. The analysis is driven by a novel measure of problem complexity, and it is first given in the general dwell-time case, after which it is specialized to the unconstrained case. We exemplify the framework for networked control systems where the minimizer signal is a discrete time delay on the control channel, and we provide extensive simulations and a real-time experiment for nonlinear systems of this type.
Online at ScienceDirect.
-
R. Frohlich, L. Tamas, Z. Kato,
Absolute Pose Estimation of Central Cameras Using Planar Regions.
IEEE Transactions on Pattern Analysis and Machine Intelligence,
2019.
In press.
Abstract: A novel method is proposed for the absolute pose estimation of a central 2D camera with respect to 3D depth data without the use of any dedicated calibration pattern or explicit point correspondences. The proposed method has no specific assumption about the data source: plain depth information is expected from the 3D sensing device and a central camera is used to capture the 2D images. Both the perspective and omnidirectional central cameras are handled within a single generic camera model. Pose estimation is formulated as a 2D-3D nonlinear shape registration task which is solved without point correspondences or complex similarity metrics. It relies on a set of corresponding planar regions, and the pose parameters are obtained by solving an overdetermined system of nonlinear equations. The efficiency and robustness of the proposed method were confirmed on both large scale synthetic data and on real data acquired from various types of sensors.
Online at IEEE.
-
L. Busoniu, T. de Bruin, D. Tolic, J. Kober, I. Palunko,
Reinforcement Learning for Control: Performance, Stability, and Deep Approximators.
Annual Reviews in Control,
vol. 46,
pages 8–28,
2018.
Abstract: Reinforcement learning (RL) offers powerful algorithms to search for optimal controllers of systems with nonlinear, possibly stochastic dynamics that are unknown or highly uncertain. This review mainly covers artificial-intelligence approaches to RL, from the viewpoint of the control engineer. We explain how approximate representations of the solution make RL feasible for problems with continuous states and control actions. Stability is a central concern in control, and we argue that while the control-theoretic RL subfield called adaptive dynamic programming is dedicated to it, stability of RL largely remains an open question. We also cover in detail the case where deep neural networks are used for approximation, leading to the field of deep RL, which has shown great success in recent years. With the control practitioner in mind, we outline opportunities and pitfalls of deep RL; and we close the survey with an outlook that - among other things - points out some avenues for bridging the gap between control and artificial-intelligence RL techniques.
Online at ScienceDirect.
Conference papers
-
Csanad Sandor, Szabolcs Pavel, Erik Wieser, Andreea Blaga, Peter Boda, Andrea-Orsolya Fulop, Adrian Ursache, Attila Zold, Aniko Kopacz, Botond Lazar, Karoly Szabo, Zoltan Tasnadi, Botond Trinfa, Lehel Csato, Dan Marius Tegzes, Marian Leontin Pop, Raluca Alexandra Tarziu, Mihai-Valentin Zaha, Sorin Mihai Grigorescu, Lucian Busoniu, Paula Raica, Levente Tamas,
The ClujUAV student competition: A corridor navigation challenge with autonomous drones.
Accepted at
21st IFAC World Congress (IFAC-20),
Online (Berlin, Germany),
12–17 July
2020.
Abstract: We describe a novel student contest concept in which an unmanned aerial vehicle (UAV or drone) must autonomously navigate a straight corridor using feedback from camera images. The objective of the contest is to promote engineering skills (related to sensing and control in particular) among students and young professionals, by means of an attractive robotics topic in an exciting competition format. The first edition of this contest was organized in Cluj-Napoca, Romania on October 19th 2019. Teams from industry and academia competed, with an overall positive experience. We outline the challenge and scoring rules, together with the technical solutions of the teams, and close with a summary of the results and points to improve for the next editions.
-
K. Mirkamali, L. Busoniu,
Cross Entropy Optimization of Action Modification Policies for Continuous-Valued MDPs.
Accepted at
21st IFAC World Congress (IFAC-20),
Online (Berlin, Germany),
12–17 July
2020.
Abstract: We propose an algorithm to search for parametrized policies in continuous state and action Markov Decision Processes (MDPs). The policies are represented via a number of basis functions, and the main novelty is that each basis function corresponds to a small, discrete modification of the continuous action. In each state, the policy chooses a discrete action modification associated with a basis function having the maximum value at the current state. Empirical returns from a representative set of initial states are estimated in simulations to evaluate the policies. Instead of using slow gradient-based algorithms, we apply the cross-entropy method for updating the parameters. The proposed algorithm is applied to a double integrator and an inverted pendulum problem, with encouraging results.
-
I. Lal, A. Codrean, L. Busoniu,
Sliding mode control of a ball balancing robot.
Accepted at
21st IFAC World Congress (IFAC-20),
Online (Berlin, Germany),
12–17 July
2020.
Abstract: This paper presents a sliding mode control design for a ball-balancing robot (ballbot), with associated real-time results. The sliding mode control is designed based on the linearized plant model, and is robust to matched uncertainties. The design is considerably simpler than other nonlinear control strategies presented in the literature, and the experimental results for stabilization and tracking show much better performance than that obtained with linear control (in particular, a linear quadratic regulator).
-
Z. Nagy, Zs. Lendek, L. Busoniu,
Control and Estimation for Mobile Sensor-Target Problems with Distance-Dependent Noise.
Accepted at
IEEE American Control Conference (ACC-20),
Online (Denver, Colorado),
1–3 July
2020.
-
M. Granzotto, R. Postoyan, L. Busoniu, D. Nesic, J. Daafouz,
Optimistic planning for the near-optimal control of nonlinear switched discrete-time systems with stability guarantees.
In
Proceedings 58th IEEE Conference on Decision and Control (CDC-19),
Nice, France,
11–13 December
2019.
Abstract: Originating in the artificial intelligence literature, optimistic planning (OP) is an algorithm that generates near-optimal control inputs for generic nonlinear discrete-time systems whose input set is finite. This technique is therefore relevant for the near-optimal control of nonlinear switched systems, for which the switching signal is the control. However, OP exhibits several limitations, which prevent its application in a standard control context. First, it requires the stage cost to take values in [0, 1], an unnatural prerequisite as it excludes, for instance, quadratic stage costs. Second, it requires the cost function to be discounted. Third, it applies for reward maximization, and not cost minimization. In this paper, we modify OP to overcome these limitations, and we call the new algorithm OPmin. We then make stabilizability and detectability assumptions, under which we derive near-optimality guarantees for OPmin and we show that the obtained bound has major advantages compared to the bound originally given by OP. In addition, we prove that a system whose inputs are generated by OPmin in a receding-horizon fashion exhibits stability properties. As a result, OPmin provides a new tool for the near-optimal, stable control of nonlinear switched discrete-time systems for generic cost functions.
-
L. Busoniu, J. Daafouz, C. Morarescu,
Near-optimal control of nonlinear systems with simultaneous controlled and random switches.
In
5th IFAC Conference on Intelligent Control and Automation Sciences (ICONS-19),
pages 268–273,
Belfast, Northern Ireland,
21–23 August
2019.
Abstract: We consider dual switched systems, in which two switching signals act simultaneously to select the dynamical mode. The first signal is controlled and the second is random, with probabilities that evolve either periodically or as a function of the dwell time. We formalize both cases as Markov decision processes, which allows them to be solved with a simple approximate dynamic programming algorithm. We illustrate the framework in a problem where the random signal is a delay on the control channel that is used to send the controlled signal to the system.
Online at ScienceDirect.
-
A.-D. Mezei, L. Tamas, L. Busoniu,
Sorting objects from a conveyor belt using active perception with a POMDP model.
In
18th IEEE European Control Conference (ECC-19),
pages 2466–2471,
Napoli, Italy,
25–28 June
2019.
Abstract: We consider an application where a robot must sort objects traveling on a conveyor belt into different classes. The detector and classifier work on 3D point clouds, but are of course not fully accurate, so they sometimes misclassify objects. We describe this task using a novel model in the formalism of partially observable Markov decision processes. With the objective of finding the correct classes with a small number of observations, we then apply a state-of-the-art POMDP solver to plan a sequence of observations from different viewpoints, as well as the moments when the robot decides the class of the current object (which automatically triggers sorting and moving the conveyor belt). In a first version, observations are carried out only for the object at the end of the conveyor belt, after which we extend the framework to observe multiple objects. The performance with both versions is analyzed in simulations, in which we study the ratio of correct to incorrect classifications and the total number of steps to sort a batch of objects. Real-life experiments with a Baxter robot are then provided with publicly shared code and data at http://community.clujit.ro/display/TEAM/Active+perception.
Online at IEEEXplore.
-
I. Lal, M. Nicoara, A. Codrean, L. Busoniu,
Hardware and Control Design of a Ball Balancing Robot.
In
22nd IEEE International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS-19),
Cluj-Napoca, Romania,
24–26 April
2019.
Abstract: This paper presents the construction of a new ball balancing robot (ballbot), together with the design of a controller to balance it vertically around a given position in the plane. Requirements on physical size and agility lead to the choice of ball, motors, gears, omnidirectional wheels, and body frame. The electronic hardware architecture is presented in detail, together with timing results showing that real-time control can be achieved. Finally, we design a linear quadratic regulator for balancing, starting from a 2D model of the robot. Experimental balancing results are satisfactory, maintaining the robot in a disc 0.3 m in diameter.
Online at IEEEXplore.
-
L. Busoniu, V. S. Varma, I.-C. Morarescu, S. Lasaulce,
Learning-based control for a communicating mobile robot under unknown rates.
In
IEEE American Control Conference (ACC-19),
pages 267–272,
Philadelphia, USA,
10–12 July
2019.
Abstract: In problems such as surveying or monitoring remote regions, a mobile robot must transmit data over a wireless network with unknown, position-dependent transmission rates. We propose an algorithm to achieve this objective that learns approximations of the rate function and of an optimal-control solution that transmits the data in minimum time. The rates are estimated with supervised learning from the samples observed; and the control is found with dynamic programming sweeps around the current state of the robot that exploit the rate function estimate, combined with online reinforcement learning. For both synthetic and realistic rate functions, our experiments show that the learning algorithm empties the data buffer in less than twice the number of steps achieved by a model-based solution that requires perfect knowledge of the rate function.
Online at IEEEXplore.
-
C. Iuga, P. Dragan, L. Busoniu,
Fall monitoring and detection for at-risk persons using a UAV.
In
Proceedings IFAC Conference on Embedded Systems, Computational Intelligence and Telematics in Control (CESCIT 2018),
Faro, Portugal,
6–8 June
2018.
Abstract: We describe a demonstrator application that uses a UAV to monitor and detect falls of an at-risk person. The position and state (upright or fallen) of the person are determined with deep-learning-based computer vision, where existing network weights are used for position detection, while for fall detection the last layer is fine-tuned in additional training. A simple visual servoing control strategy keeps the person in view of the drone, and maintains the drone at a set distance from the person. In experiments, falls were reliably detected, and the algorithm was able to successfully track the person indoors.
Online at ScienceDirect.
Disclaimer: The following applies to the papers that are directly available for download as PDF files. This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each copyright holder. In most cases, these works may not be reposted without the explicit permission of the copyright holder. Additionally, the following applies to IEEE material: Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the IEEE.