Professor Peter Vamplew
Associate Dean Research
Biography
Professor Peter Vamplew’s information technology expertise focuses on artificial intelligence, particularly reinforcement learning. He is currently researching variations on reinforcement learning algorithms for multi-objective problems, which contribute to the explainability and safety of autonomous AI systems. Peter’s research has been published widely in highly ranked international journals.
Peter co-leads the Australian Responsible Autonomous Agents Collective, a multi-institution research group that focuses on reinforcement learning and related topics. He is also a senior member of the Future of Life Institute’s Existential AI Risk research community.
Peter has been a Professor in Information Technology at Federation University Australia since 2023. He was previously an Associate Professor at Federation (2014–2023) and a Senior Lecturer at the University of Ballarat (2005–2014). Before that, he was a lecturer in the computing discipline at the University of Tasmania from 1991 to 2005, where he received his PhD in 1996.
- Publications
On Generalization Across Environments In Multi-Objective Reinforcement Learning
An empirical investigation of value-based multi-objective reinforcement learning for stochastic environments
- Journals
- DOI reference: 10.1017/S0269888925100052
AI apology: a critical review of apology in AI systems
- Journals
- DOI reference: 10.1007/s10462-025-11305-8
Elastic step DQN: A novel multi-step algorithm to alleviate overestimation in Deep Q-Networks
- Journals
- DOI reference: 10.1016/j.neucom.2023.127170
Assessing the impact of griefing in MMORPGs using self-determination theory
- Journals
- DOI reference: 10.1016/j.chb.2024.108388
Position: Intent-aligned AI Systems Must Optimize for Agency Preservation
Griefing in MMORPGs
- Dictionary/Encyclopaedia
- DOI reference: 10.1007/978-3-031-23161-2_200
Utility-Based Reinforcement Learning: Unifying Single-objective and Multi-objective Reinforcement Learning
- Conference Proceedings
- DOI reference: 10.48550/arxiv.2402.02665
AI apology: interactive multi-objective reinforcement learning for human-aligned AI
- Journals
- DOI reference: 10.1007/s00521-023-08586-x
A NetHack Learning Environment Language Wrapper for Autonomous Agents
- Journals
- DOI reference: 10.5334/JORS.444
Elastic step DDPG: Multi-step reinforcement learning for improved sample efficiency
- Conference Proceedings
- DOI reference: 10.1109/IJCNN54540.2023.10191774
A Brief Guide to Multi-Objective Reinforcement Learning and Planning (JAAMAS track)
Explainable reinforcement learning for broad-XAI: a conceptual framework and survey
- Journals
- DOI reference: 10.1007/s00521-023-08423-1
Persistent rule-based interactive reinforcement learning
- Journals
- DOI reference: 10.1007/s00521-021-06466-w
A conceptual framework for externally-influenced agents: an assisted reinforcement learning review
- Journals
- DOI reference: 10.1007/s12652-021-03489-y
Human engagement providing evaluative and informative advice for interactive reinforcement learning
- Journals
- DOI reference: 10.1007/s00521-021-06850-6
Scalar reward is not enough: a response to Silver, Singh, Precup and Sutton (2021)
- Journals
- DOI reference: 10.1007/s10458-022-09575-5
A Low-Level Hybrid Intrusion Detection System Based on Hardware Performance Counters
- Book Chapters
- DOI reference: 10.1201/9781003194538-9
Evaluating Human-like Explanations for Robot Actions in Reinforcement Learning Scenarios
- Conference Proceedings
- DOI reference: 10.1109/IROS47612.2022.9981334
Discrete-to-deep reinforcement learning methods
- Journals
- DOI reference: 10.1007/s00521-021-06270-6
An online scalarization multi-objective reinforcement learning algorithm: TOPSIS Q-learning
- Journals
- DOI reference: 10.1017/S0269888921000163
A practical guide to multi-objective reinforcement learning and planning
- Journals
- DOI reference: 10.1007/s10458-022-09552-y
An evaluation methodology for interactive reinforcement learning with simulated users
- Journals
- DOI reference: 10.3390/biomimetics6010013
Explainable robotic systems: understanding goal-driven actions in a reinforcement learning scenario
- Journals
- DOI reference: 10.1007/s00521-021-06425-5
Language Representations for Generalization in Reinforcement Learning
Potential-based multiobjective reinforcement learning approaches to low-impact agents for AI safety
- Journals
- DOI reference: 10.1016/j.engappai.2021.104186
A Prioritized objective actor-critic method for deep reinforcement learning
- Journals
- DOI reference: 10.1007/s00521-021-05795-0
The impact of environmental stochasticity on value-based multiobjective reinforcement learning
- Journals
- DOI reference: 10.1007/s00521-021-05859-1
Levels of explainable artificial intelligence for human-aligned conversational explanations
- Journals
- DOI reference: 10.1016/j.artint.2021.103525
Discrete-to-Deep Supervised Policy Learning: An effective training method for neural reinforcement learning
Reanimating historic malware samples
- Book Chapters
- DOI reference: 10.1007/978-3-030-62582-5_13
Motivational Factors of Australian Mobile Gamers
- Conference Proceedings
- DOI reference: 10.1145/3373017.3373066
Griefing in MMORPGs
- Dictionary/Encyclopaedia
- DOI reference: 10.1007/978-3-319-08234-9_200-1
Function Similarity Using Family Context
- Journals
- DOI reference: 10.3390/electronics9071163
A multi-objective deep reinforcement learning framework
- Journals
- DOI reference: 10.1016/j.engappai.2020.103915
API Based Discrimination of Ransomware and Benign Cryptographic Programs
- Conference Proceedings
- DOI reference: 10.1007/978-3-030-63833-7_15
Identifying cross-version function similarity using contextual features
- Conference Proceedings
- DOI reference: 10.1109/TrustCom50675.2020.00110
Enhancing Model Performance for Fraud Detection by Feature Engineering and Compact Unified Expressions
- Conference Proceedings
- DOI reference: 10.1007/978-3-030-38961-1_35
Hybrid intrusion detection system based on the stacking ensemble of C5 decision tree classifier and one class support vector machine
- Journals
- DOI reference: 10.3390/electronics9010173
Evolved similarity techniques in Malware Analysis
- Conference Proceedings
- DOI reference: 10.1109/TrustCom/BigDataSE.2019.00061
A novel ensemble of hybrid intrusion detection system for detecting internet of things attacks
- Journals
- DOI reference: 10.3390/electronics8111210
An Empirical Study of Reward Structures for Actor-Critic Reinforcement Learning in Air Combat Manoeuvring Simulation
- Conference Proceedings
- DOI reference: 10.1007/978-3-030-35288-2_5
Memory-Based Explainable Reinforcement Learning
- Conference Proceedings
- DOI reference: 10.1007/978-3-030-35288-2_6
Integrating Biological Heuristics and Gene Expression Data for Gene Regulatory Network Inference
- Conference Proceedings
- DOI reference: 10.1145/3290688.3290741
Categorical features transformation with compact one-hot encoder for fraud detection in distributed environment
- Conference Proceedings
- DOI reference: 10.1007/978-981-13-6661-1_6
Survey of intrusion detection systems: techniques, datasets and challenges
- Journals
- DOI reference: 10.1186/s42400-019-0038-7
An anomaly intrusion detection system using C5 decision tree classifier
- Conference Proceedings
- DOI reference: 10.1007/978-3-030-04503-6_14
Rapid anomaly detection using integrated prudence analysis (IPA)
- Conference Proceedings
- DOI reference: 10.1007/978-3-030-04503-6_12
SoniFight: Software to Provide Additional Sonification Cues to Video Games for Visually Impaired Players
- Journals
- DOI reference: 10.1007/s40869-018-0059-6
Participant observation of griefing in a journey through the World of Warcraft
Human-aligned artificial intelligence is a multiobjective problem
- Journals
- DOI reference: 10.1007/s10676-017-9440-6
Non-functional regression: A new challenge for neural networks
- Journals
- DOI reference: 10.1016/j.neucom.2018.06.066
Special issue on multi-objective reinforcement learning
- Journals
- DOI reference: 10.1016/j.neucom.2017.06.020
An agile group aware process beyond CRISP-DM: A hospital data mining case study
- Conference Proceedings
- DOI reference: 10.1145/3093241.3093273
Evaluating accuracy in prudence analysis for cyber security
- Conference Proceedings
- DOI reference: 10.1007/978-3-319-70139-4_41
Steering approaches to Pareto-optimal multiobjective reinforcement learning
- Journals
- DOI reference: 10.1016/j.neucom.2016.08.152
Softmax exploration strategies for multiobjective reinforcement learning
- Journals
- DOI reference: 10.1016/j.neucom.2016.09.141
A taxonomy of griefer type by motivation in massively multiplayer online role-playing games
- Journals
- DOI reference: 10.1080/0144929X.2017.1306109
Enhanced temporal difference learning using compiled eligibility traces
- Conference Proceedings
- DOI reference: 10.1007/11941439_18
An efficient approach to unbounded bi-objective archives: Introducing the Mak_Tree algorithm
Concurrent Q-learning: Reinforcement learning for dynamic goals and environments
The combative accretion model: Multiobjective optimisation without explicit pareto ranking
- Conference Proceedings
- DOI reference: 10.1007/978-3-540-31880-4_6
Global versus local constructive function approximation for on-line reinforcement learning
An efficient data structure for unbounded bi-objective archives: Introducing the mak_tree
On-line reinforcement learning using cascade constructive neural networks
Lego™ Mindstorms™ robots as a platform for teaching reinforcement learning
Reducing the time complexity of goal-independent reinforcement learning
A language for platform independent communication and storage in multiobjective optimisation
PoD can mutate: A simple dynamic directed mutation approach for genetic algorithms
Generalised algorithms for redirected walking in virtual environments
Refining search queries from examples using boolean expressions and latent semantic analysis
A simplified artificial life model for multiobjective optimisation: A preliminary report
Accelerating real-valued genetic algorithms using mutation-with-momentum
Using Corpus Analysis to Inform Research into Opinion Detection in Blogs
On the limitations of scalarisation for multi-objective reinforcement learning of Pareto Fronts
- Conference Proceedings
- DOI reference: 10.1007/978-3-540-89378-3_37
Using Stereotypes to Improve Early-Match Poker Play
- Conference Proceedings
- DOI reference: 10.1007/978-3-540-89378-3_59
Unsupervised Color Textured Image Segmentation Using Cluster Ensembles and MRF Model
- Book Chapters
- DOI reference: 10.1007/978-1-4020-8741-7_59
A polynomial ring construction for the classification of data
- Journals
- DOI reference: 10.1017/S0004972708001111
Weblogs for market research: Finding more relevant opinion documents using system fusion
- Journals
- DOI reference: 10.1108/14684520911001882
Incorporating Expert Advice into Reinforcement Learning Using Constructive Neural Networks
- Conference Proceedings
- DOI reference: 10.1007/978-3-642-04512-7_11
Constructing Stochastic Mixture Policies for Episodic Multiobjective Reinforcement Learning Tasks
- Conference Proceedings
- DOI reference: 10.1007/978-3-642-10439-8_35
Inference of Gene Expression Networks using Memetic Gene Expression Programming
Applying Clustering and Ensemble Clustering Approaches to Phishing Profiling
The Ballarat Incremental Knowledge Engine
- Conference Proceedings
- DOI reference: 10.1007/978-3-642-15037-1_17
Automated Opinion Detection: Implications of the Level of Agreement Between Human Raters
- Journals
- DOI reference: 10.1016/j.ipm.2009.08.005
Automatic sleep stage identification: difficulties and possible solutions
MRF model based unsupervised color textured image segmentation using multidimensional spatially variant finite mixture model
- Conference Proceedings
- DOI reference: 10.1007/978-90-481-3656-8_68
Unsupervised Segmentation of Industrial Images using Markov Random Field Model
- Conference Proceedings
- DOI reference: 10.1007/978-90-481-3656-8_67
Footy, flows and farms: a visualisation tool for determining community water allocation preferences
WINDSCREEN: A climate change visualisation tool for water allocation decisions
Detecting K-complexes for sleep stage identification using nonsmooth optimization
- Journals
- DOI reference: 10.1017/S1446181112000016
Empirical evaluation methods for multiobjective reinforcement learning algorithms
- Journals
- DOI reference: 10.1007/s10994-010-5232-5
Reinforcement learning approach to AIBO robot's decision making process in Robosoccer's goal keeper problem
- Conference Proceedings
- DOI reference: 10.1109/SNPD.2011.39
Using psycholinguistic features for profiling first language of authors
- Journals
- DOI reference: 10.1002/asi.22627
Applications of machine learning for linguistic analysis of texts
- Book Chapters
- DOI reference: 10.4018/978-1-4666-1833-6.ch008
RM and RDM, a Preliminary Evaluation of Two Prudent RDR Techniques
- Conference Proceedings
- DOI reference: 10.1007/978-3-642-32541-0_16
An empirical comparison of two common multiobjective reinforcement learning algorithms
- Conference Proceedings
- DOI reference: 10.1007/978-3-642-35101-3_53
Prudent fraud detection in internet banking
- Conference Proceedings
- DOI reference: 10.1109/CTC.2012.13
A Survey of Multi-Objective Sequential Decision-Making
- Journals
- DOI reference: 10.1613/jair.3987
Ganking, corpse camping and ninja looting from the perception of the MMORPG community: Acceptable behavior or unacceptable griefing?
- Conference Proceedings
- DOI reference: 10.1145/2513002.2513007
Weblogs for market research: improving opinion detection using system fusion
- Conference Proceedings
- DOI reference: 10.1109/ICSSSM.2008.4598502
Griefers versus the Griefed - what motivates them to play Massively Multiplayer Online Role-Playing Games?
- Journals
- DOI reference: 10.1007/BF03392354
Patient admission prediction using a pruned fuzzy min-max neural network with rule extraction
- Journals
- DOI reference: 10.1007/s00521-014-1631-z
Reinforcement learning of pareto-optimal multiobjective policies using steering
- Conference Proceedings
- DOI reference: 10.1007/978-3-319-26350-2_53
A Heuristic Gene Regulatory Networks Model for Cardiac Function and Pathology
Caliko: An Inverse Kinematics Software Library Implementation of the FABRIK Algorithm
- Journals
- DOI reference: 10.5334/jors.116
