Showing 1–11 of 11 results for author: Taesiri, M R

  1. arXiv:2407.15295  [pdf, other]

    cs.CV cs.SE

    VideoGameBunny: Towards vision assistants for video games

    Authors: Mohammad Reza Taesiri, Cor-Paul Bezemer

    Abstract: Large multimodal models (LMMs) hold substantial promise across various domains, from personal assistance in daily tasks to sophisticated applications like medical diagnostics. However, their capabilities have limitations in the video game domain, such as challenges with scene understanding, hallucinations, and inaccurate descriptions of video game content, especially in open-source models. This pa…

    Submitted 21 July, 2024; originally announced July 2024.

  2. arXiv:2407.06581  [pdf, other]

    cs.AI cs.CV

    Vision language models are blind

    Authors: Pooyan Rahmanzadehgervi, Logan Bolton, Mohammad Reza Taesiri, Anh Totti Nguyen

    Abstract: While large language models with vision capabilities (VLMs), e.g., GPT-4o and Gemini 1.5 Pro, are powering various image-text applications and scoring high on many vision-understanding benchmarks, we find that they are surprisingly still struggling with low-level vision tasks that are easy to humans. Specifically, on BlindTest, our suite of 7 very simple tasks such as identifying (a) whether two c…

    Submitted 25 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

  3. arXiv:2404.05238  [pdf, other]

    cs.CV cs.HC

    Allowing humans to interactively guide machines where to look does not always improve human-AI team's classification accuracy

    Authors: Giang Nguyen, Mohammad Reza Taesiri, Sunnie S. Y. Kim, Anh Nguyen

    Abstract: Via thousands of papers in Explainable AI (XAI), attention maps \cite{vaswani2017attention} and feature importance maps \cite{bansal2020sam} have been established as a common means for finding how important each input feature is to an AI's decisions. It is an interesting, unexplored question whether allowing users to edit the feature importance at test time would improve a human-AI team's accuracy…

    Submitted 20 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

    Comments: Accepted for presentation at the XAI4CV Workshop, part of the CVPR 2024 proceedings

  4. arXiv:2312.05291  [pdf, other]

    cs.CV cs.AI cs.CL

    GlitchBench: Can large multimodal models detect video game glitches?

    Authors: Mohammad Reza Taesiri, Tianjun Feng, Anh Nguyen, Cor-Paul Bezemer

    Abstract: Large multimodal models (LMMs) have evolved from large language models (LLMs) to integrate multiple input modalities, such as visual inputs. This integration augments the capacity of LLMs for tasks requiring visual comprehension and reasoning. However, the extent and limitations of their enhanced abilities are not fully understood, especially when it comes to real-world tasks. To address this gap,…

    Submitted 29 March, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

    Comments: CVPR 2024

  5. arXiv:2308.13651  [pdf, other]

    cs.CV cs.HC

    PCNN: Probable-Class Nearest-Neighbor Explanations Improve Fine-Grained Image Classification Accuracy for AIs and Humans

    Authors: Giang Nguyen, Valerie Chen, Mohammad Reza Taesiri, Anh Totti Nguyen

    Abstract: Nearest neighbors (NN) are traditionally used to compute final decisions, e.g., in Support Vector Machines or k-NN classifiers, and to provide users with explanations for the model's decision. In this paper, we show a novel utility of nearest neighbors: To improve predictions of a frozen, pretrained image classifier C. We leverage an image comparator S that (1) compares the input image with NN ima…

    Submitted 26 August, 2024; v1 submitted 25 August, 2023; originally announced August 2023.

    Comments: Accepted to Transactions on Machine Learning Research 2024; 50 pages, 35 Figures & 17 Tables

  6. arXiv:2304.05538  [pdf, other]

    cs.CV

    ImageNet-Hard: The Hardest Images Remaining from a Study of the Power of Zoom and Spatial Biases in Image Classification

    Authors: Mohammad Reza Taesiri, Giang Nguyen, Sarra Habchi, Cor-Paul Bezemer, Anh Nguyen

    Abstract: Image classifiers are information-discarding machines, by design. Yet, how these models discard information remains mysterious. We hypothesize that one way for image classifiers to reach high accuracy is to first zoom to the most discriminative region in the image and then extract features from there to predict image labels, discarding the rest of the image. Studying six popular networks ranging f…

    Submitted 8 October, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

    Comments: NeurIPS 2023 Track on Datasets and Benchmarks

  7. arXiv:2210.02506  [pdf, other]

    cs.CL cs.SE

    Large Language Models are Pretty Good Zero-Shot Video Game Bug Detectors

    Authors: Mohammad Reza Taesiri, Finlay Macklon, Yihe Wang, Hengshuo Shen, Cor-Paul Bezemer

    Abstract: Video game testing requires game-specific knowledge as well as common sense reasoning about the events in the game. While AI-driven agents can satisfy the first requirement, it is not yet possible to meet the second requirement automatically. Therefore, video game testing often still relies on manual testing, and human testers are required to play the game thoroughly to detect bugs. As a result, i…

    Submitted 5 October, 2022; originally announced October 2022.

  8. arXiv:2208.02335  [pdf, other]

    cs.SE

    Automatically Detecting Visual Bugs in HTML5 <canvas> Games

    Authors: Finlay Macklon, Mohammad Reza Taesiri, Markos Viggiato, Stefan Antoszko, Natalia Romanova, Dale Paas, Cor-Paul Bezemer

    Abstract: The HTML5 <canvas> is used to display high quality graphics in web applications such as web games (i.e., <canvas> games). However, automatically testing <canvas> games is not possible with existing web testing techniques and tools, and manual testing is laborious. Many widely used web testing tools rely on the Document Object Model (DOM) to drive web test automation, but the contents of the <canva…

    Submitted 3 August, 2022; originally announced August 2022.

    Comments: Accepted at ASE 2022 conference

  9. arXiv:2208.00780  [pdf, other]

    cs.CV cs.AI cs.HC cs.LG

    Visual correspondence-based explanations improve AI robustness and human-AI team accuracy

    Authors: Giang Nguyen, Mohammad Reza Taesiri, Anh Nguyen

    Abstract: Explaining artificial intelligence (AI) predictions is increasingly important and even imperative in many high-stakes applications where humans are the ultimate decision-makers. In this work, we propose two novel architectures of self-interpretable image classifiers that first explain, and then predict (as opposed to post-hoc explanations) by harnessing the visual correspondences between a query i…

    Submitted 30 August, 2023; v1 submitted 26 July, 2022; originally announced August 2022.

    Comments: NeurIPS 2022 conference paper

  10. arXiv:2203.11096  [pdf, other]

    cs.CV cs.SE

    CLIP meets GamePhysics: Towards bug identification in gameplay videos using zero-shot transfer learning

    Authors: Mohammad Reza Taesiri, Finlay Macklon, Cor-Paul Bezemer

    Abstract: Gameplay videos contain rich information about how players interact with the game and how the game responds. Sharing gameplay videos on social media platforms, such as Reddit, has become a common practice for many players. Often, players will share gameplay videos that showcase video game bugs. Such gameplay videos are software artifacts that can be utilized for game testing, as they provide insig…

    Submitted 22 March, 2022; v1 submitted 21 March, 2022; originally announced March 2022.

    Comments: Accepted by MSR 2022 conference

  11. arXiv:2109.12321  [pdf, other]

    cs.SI cs.LG

    Under the Skin of Foundation NFT Auctions

    Authors: MohammadAmin Fazli, Ali Owfi, Mohammad Reza Taesiri

    Abstract: Non Fungible Tokens (NFTs) have gained a solid foothold within the crypto community, and substantial amounts of money have been allocated to their trades. In this paper, we studied one of the most prominent marketplaces dedicated to NFT auctions and trades, Foundation. We analyzed the activities on Foundation and identified several intriguing underlying dynamics that occur on this platform. Moreov…

    Submitted 25 September, 2021; originally announced September 2021.