
    14.1p VoiceAssist feedback

    Hi Isomorphic,

    I just tried Triple-Control VoiceAssist here (v14.1p_2025-11-09) in the search field, and the audio recognition is working great for me in Windows 11 Chromium 142. Pretty cool feature. There seems to be a bug, though: I say "Bermuda" and it's filled into the field, but then on pressing Enter the field is emptied again, and the Developer Console log says "17:15:41.315:WARN:Log:Canceling value dictation". I also assume that the log level here is a bit too aggressive:

    Code:
    *17:15:10.845:INFO:Log:initialized
    *17:15:11.035:WARN:Log:New Class ID: 'EditContext' collides with ID of existing object with value 'function EditContext() { [native code] }'. Existing object will be replaced.
    This conflict would be avoided by disabling ISC Simple Names mode. See documentation for further information.
    *17:15:11.248:INFO:Log:Google Maps APIs loaded correctly
    *17:15:11.473:INFO:Log:isc.Page is loaded
    17:15:11.924:WARN:VoiceAssist:Speech key handler installed for 'Control' key
    17:15:15.043:WARN:Log:Feature enabled: true
    17:15:17.142:WARN:Log:Starting value dictation - isc_DynamicForm_0_TextItem_isc_OID_4
    17:15:18.844:WARN:Log:Value (interim): ber
    17:15:18.942:WARN:Log:Value (interim): berm
    17:15:19.152:WARN:Log:Value (interim): bermud
    17:15:19.180:WARN:Log:Value (interim): Bermuda
    17:15:19.833:WARN:Log:Canceling value dictation
    As stated in the docs, it does not work in Firefox 144, but it also does not work in Edge 142, which is unexpected.

    Trying this in my non-default browsers, I wasn't logged in at smartclient.com and therefore could not use VoiceAssist, as it showed your "Ask to be enabled for AI samples" popup.
    If this is using the browser's SpeechRecognition API, why does it require your AI package? Shouldn't everything be handled by the browser/locally?
    Also, the close button in the "VoiceAssist activated" Notify is a bit misplaced (see attachment).

    Best regards
    Blama

    PS: I'm pretty sure I did not see a "new feature" blog post about this, and it seems to have existed since 14.0. As it is (again) a pretty cool feature, I think a post about it would be good.

    #2
    PS: In Edge 142 I get the browser's "Do you want to allow" popup (see picture) and then a red dot in the tab title in the browser UI for a few seconds after a Ctrl double-tap, but no text is recognised.

    [Attachment: VoiceAssist allow qm.png]


    Code:
    17:24:54.743:WARN:Log:Starting value dictation - isc_DynamicForm_0_TextItem_isc_OID_6
    17:24:56.746:TMR5:WARN:Log:Dictated value:
    17:24:56.747:TMR5:WARN:Log:Stopping value dictation



      #3

      Actually, it is working sporadically. I then get "Bermuda, bermuda" when I definitely only said it once. So perhaps it is more of an Edge problem.



        #4
        hi Blama, thanks for the feedback.

        This is a new feature, though it has been ported back some way because, as you say, it's pretty cool. :) So there will be a blog post for this - we just didn't get to it yet.

        By way of detail - SpeechRecognition is a browser API, but the speech-to-text itself doesn't happen in the browser. Chromium uses a proprietary Google service, which is why it works in Chrome but not in various Chromium offshoots like Brave, where it is intentionally not implemented for security reasons - ie, it would mean sending voice recordings to a third-party server. In the specific case of Brave, the SpeechRecognition class is present but not fully implemented, so there's no way to "turn it on" - you can start a recording without errors, but it doesn't register speech or fire any events, and it stops after a couple of seconds.
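
        For anyone curious, the browser side of this looks roughly as follows - a minimal sketch of driving the standard SpeechRecognition API directly (illustrative only, not the actual VoiceAssist code), including how a stubbed-out service like Brave's shows up as a session that ends without ever producing results:

        Code:
        // Minimal sketch of using the browser's SpeechRecognition API directly
        // (illustrative only - not the actual VoiceAssist implementation).
        var Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;
        if (!Recognition) {
            console.warn("SpeechRecognition is not available in this browser");
        } else {
            var recognition = new Recognition();
            recognition.interimResults = true; // deliver partial transcripts as they arrive
            recognition.lang = "en-US";

            var gotResult = false;
            recognition.onresult = function (event) {
                gotResult = true;
                var last = event.results[event.results.length - 1];
                console.log((last.isFinal ? "final: " : "interim: ") + last[0].transcript);
            };
            recognition.onerror = function (event) {
                console.warn("recognition error: " + event.error);
            };
            recognition.onend = function () {
                // In Brave the class is present but the service is stubbed out:
                // start() succeeds, no results or errors ever fire, and the
                // session simply ends after a couple of seconds.
                if (!gotResult) console.warn("session ended without recognising any speech");
            };
            recognition.start();
        }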

        Edge now uses a proprietary Microsoft Azure service, so differences in what gets recorded are likely down to that engine. We weren't aware of any differences in processing between Google's and Microsoft's services, but we'll do some testing with recent Edge and let you know if we find anything we can improve.

        On the issues:

        1) as you say, value dictation doesn't need AI, so we'll make sure we don't perform that check. Note that if you DO have AI, there are some niceties - for example, you can voice-filter by holding down Control and saying "show me Calendar samples".

        2) we don't see the issue where a value dictated to a filterEditor field gets cleared - even on our 14.0 online showcase this works. It could be related to the incorrect requirement for AI to be present, if the recording is cancelled. We'll look a bit deeper, but please provide an exact sample/steps if it isn't just 'any Country sample, say "Bermuda"'!

        3) the logging has been fixed back to 14.1 for the moment - set the VoiceAssist log category to DEBUG if you want to see all the detailed output (see the sketch after this list)

        4) We'll fix the misplaced close icon in one of the Notifies and get back to you
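
        For reference, turning on that detailed output from the Developer Console looks roughly like this (a sketch - please check the Log documentation for the exact priority API and constants):

        Code:
        // Sketch: raise the VoiceAssist log category to DEBUG so the detailed
        // per-session output becomes visible again (illustrative only - verify
        // the exact API against the Log docs).
        isc.Log.setPriority("VoiceAssist", isc.Log.DEBUG);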



          #5
          hi Blama,

          A quick follow-up.

          We've made some changes - from tomorrow's builds, dated November 20:

          1) you shouldn't see any AI-related issues when dictating item-values

          2) we've made a change to pre-emptively ask for microphone permissions the first time you triple-tap Control to enable the feature. If the microphone isn't allowed, you'll get an isc.say() to that effect; if the browser pops the "Allow Microphone" popup and you disallow the permissions, you'll get the same isc.say(). If you allow the permissions, you'll then see the "VoiceAssist enabled" type Notify, and that means we know it'll work later when you actually try to start a recording (see the sketch after this list)

          3) the Azure Speech API that Edge uses is *considerably* slower than the Google service, both in the initial handshake and in providing each interim transcription. We've made a quick change to wait longer for Edge to handshake, and that fixes the issue where it would stop recording before it got around to starting! Note that this slowness from Azure Speech also means that you'll need to wait until you see your spoken words appear on-screen before you double-tap Control to end dictation - otherwise, your dictation will be cut short. We're looking at improving this.

          4) in Edge, we actually *did* eventually see the issue where, if you said for example "monaco", the filter value would end up being "Monaco. monaco". The duplication was a non-Chrome bug that's been fixed, and the "Monaco.", with the trailing ".", was caused by the grammar features of the Azure Speech service correcting "monaco" into "Monaco." That breaks value dictation in most cases - certainly for filter values, where punctuation will either break the search or, in some cases, invoke unexpected filtering behaviors - so we've prevented it from adding punctuation at the end.
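
          The pre-emptive permission check in 2) is roughly along these lines - a sketch against the standard browser APIs rather than the actual VoiceAssist code:

          Code:
          // Sketch: request microphone access up front so we know whether a
          // later recording can succeed (standard browser APIs only; not the
          // actual VoiceAssist implementation).
          navigator.mediaDevices.getUserMedia({ audio: true }).then(function (stream) {
              // Permission granted - release the mic again; the speech service
              // will re-open it when a recording actually starts.  At this point
              // the "VoiceAssist enabled" type Notify can safely be shown.
              stream.getTracks().forEach(function (track) { track.stop(); });
          }).catch(function () {
              // Permission denied, or no microphone is present.
              isc.say("VoiceAssist needs access to your microphone to work");
          });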

          We're looking at the issue with the close button in the initial Notify moving after the initial draw() - it's a Notify issue rather than a VoiceAssist issue per se - we'll update when we have more to say.



            #6
            hi Blama,

            Following up on this, we've made a bunch of changes to improve VoiceAssist across browsers.

            # we now hook all the SpeechRecognition events to better distinguish session start/end from audio and speech start/end (a sketch of the relevant events follows this list)

            # the AI commandWindow/dictation-icon now won't show until the session has started and the mic is ready

            # browsers with slower services, like Azure Speech in Edge, will no longer clip speech. For example, if you double-tap Control and say "Bermuda", the value will appear almost instantly in Chrome, but if you do the same in Edge you'll have plenty of time to double-tap Control again to stop recording manually before the text appears. If you do that now, the value-dictation icon will stay visible until the service returns its final transcript for processing, before the session fully ends

            # tweaks to cancel sessions when the final transcript is empty, which can happen via several legitimate means

            # better error-handling and log details for network errors and browsers that block speech-to-text services
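
            For reference, the lifecycle events in question are the standard SpeechRecognition ones - roughly as below (a sketch, not the actual VoiceAssist source):

            Code:
            // Sketch: the standard SpeechRecognition lifecycle events, in roughly
            // the order they fire (illustrative only - not the VoiceAssist source).
            var recognition = new (window.SpeechRecognition || window.webkitSpeechRecognition)();
            recognition.onstart       = function ()  { /* session started - mic is ready */ };
            recognition.onaudiostart  = function ()  { /* audio capture has begun */ };
            recognition.onspeechstart = function ()  { /* the service detected speech */ };
            recognition.onresult      = function (e) { /* interim and final transcripts */ };
            recognition.onspeechend   = function ()  { /* the service stopped hearing speech */ };
            recognition.onaudioend    = function ()  { /* audio capture has ended */ };
            recognition.onerror       = function (e) { /* eg "network", "not-allowed", "service-not-allowed" */ };
            recognition.onend         = function ()  { /* session fully over - safe to clean up */ };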

            You can try out these changes in today's builds, dated December 6 and later.



              #7
              Hi Isomorphic,

              I tested a bit here (v14.1p_2025-12-06):
              1. With Firefox: I don't get a notification that Voice Assist is not supported. Instead, the Notify lets me assume that it will work.
              2. With Edge 143 and Chromium 142: The sample is closed and "Bermuda" is instead put into the search field of the Showcase sample search.
              3. The Notify design seems to be repaired.
              4. The sample still requires one to be logged in.
              5. When you hold the "Control" button, the Notify shows "Command: Command:" ("Command" twice)
              Best regards
              Blama



                #8
                hi Blama,

                We'll address #1 and #5.

                For #4, it's not clear what you mean about logging in - you would only need to log in for AI samples, not normal ones like id=liveFilter. Testing just now, we can use that sample at the link you showed without logging in, and it works as expected.

                For #2, we aren't seeing any problems in Edge or Chromium in the mentioned versions - double-tap value-dictation only starts if there's a focused formItem, and it can only affect that same FormItem.

                However, if you *have* logged in, so that AI is available, you may have *held down* Control (to record a command) and said "Bermuda" - if you do that and AI is available, the AI will interpret it as a request for a sample search and show that UI, as you described.



                  #9
                  Hi Isomorphic,

                  thanks.
                  Regarding #4 in my last post and needing to be logged in, it is definitely related to holding the Ctrl-key, as you say.
                  So the steps are:
                  1. Go to the showcase (v14.1p_2025-12-06) while not logged in
                  2. Click the search field
                  3. 3x Ctrl to enable Voice Assist
                    1. Double-tap Ctrl: Dictation works
                    2. Hold Ctrl: Message that I need to be logged in
                  The last point might be the issue here, I think. Am I right in thinking that Voice Assist "just works" out of the box in BuiltInDS and other standalone examples in 14.0+ by just pressing 3x Ctrl?
                  Then the problem is not really a Showcase problem, but shows up in these two cases:
                  • What happens if one uses v14.1+ and does not have AI? Then the Notify should not say "Hold down Control to record a command or double-tap to dictate a value", but only "Double-tap Control to dictate a value" or similar.
                  • Similarly for v14.0 (v14.0p_2025-12-06), which has Voice Assist but not AI

                  Regarding #2 in my last post:
                  That's it. If Command automatically means "Search a sample", then that's exactly what's happening. It just wasn't clear to me.
                  Can you explain what a Command is? Does every screen need to set up possible commands or is this always relative to the current screen? So should "Allow editing in the displayed ListGrid" always just set canEdit: true for the ListGrid in focus or something like that?
                  I also have a general suggestion here: as a non-native English speaker, it took me 5 attempts to get this command recognised without typos. Perhaps it would be a good idea to show the text coming back from VA in a text field, so that it can be edited before executing it as a command.

                  Best regards
                  Blama

                  PS: I wrote in #3 of my last post that the Notify close button issue is fixed. That is not true - it is still happening, but differently. The X first shows up correctly, but then moves and covers part of the text.



                    #10
                    VoiceAssist is available out of the box, yes, given a call to isc.VoiceAssist.enable(). We agree, those notify-instances should only mention the available features.

                    In this context, commands are prompts for the AI - when a Canvas implements supportsVoiceCommands() and has focus, its doVoiceCommand() implementation is passed command-transcriptions from VoiceAssist for actioning, for example by passing them on to an AI if one is available.
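
                    To make that concrete, a widget opting in might look roughly like this - an illustrative sketch using the hooks named above (check the docs for the exact signatures):

                    Code:
                    // Sketch: a grid opting in to voice commands via the hooks named
                    // above (illustrative only - check the docs for exact signatures).
                    isc.VoiceAssist.enable();

                    isc.ListGrid.create({
                        ID: "countryGrid",
                        dataSource: "countryDS",   // hypothetical DataSource, for illustration
                        supportsVoiceCommands: function () {
                            return true;   // this widget will accept command transcriptions
                        },
                        doVoiceCommand: function (transcription) {
                            // Pass the spoken command on for AI-driven configuration,
                            // eg "sort by country" or "group by continent".
                            this.configureViaAI(transcription);
                        }
                    });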

                    In the Showcase, when AI is available, the FeatureExplorer implements those methods and intercepts all voice-*commands* (but not double-tap values) - right now, the implementation asks the AI if the transcription sounds like it relates to a grid - if not, it is treated like a sample-search request. If it's a grid-related command and a grid is focused or one is present in the current sample, the command is passed as the parameter to grid.configureViaAI(). You can try this out by holding Control and asking for something grids do, like "sort by country" or "group by continent".

                    On the question of allowing commands to be edited before being forwarded to doVoiceCommand(), we agree that makes sense - we'll consider a flag that provides a means of delaying command-transcriptions pending user-approval.

                    Finally, on your P.S. - hmm, we do see that the close icon is still mispositioned, in 14.0 only, which is peculiar, because it should no longer be showing at all! We'll take a look.



                      #11
                      Hi Isomorphic,

                      thanks, that's great.

                      Another suggestion is this:
                      When you dictate into the Showcase sample search and the word you said is displayed while the green microphone logo is still showing, pressing "Enter" removes the already recognised text.
                      My suggestion for a different handling would be:
                      • In case of "Enter", the speech recognition should just stop and the "Enter" keypress should bubble through to the TextItem
                      • Currently, the already recognised text is also removed for "Esc", but there that is what I would expect
                      • For any other "normal" keypress, I'd expect it to amend the already recognised text. This probably means the same handling as for "Enter"
                      Best regards
                      Blama

