url_string convert function with Vietnamese language

Comments for “url_string convert function with Vietnamese language”
 

Posted by quangquoc on Wednesday 29th June 2022 at 20:32 GMT

// $string = "Tiêu đề tin tức 1";

function url_title($str, $make_lowercase=false) {
    $str = $make_lowercase == true ? trim(strtolower($str)) : trim($str);
    $str = preg_replace('/\s+/', ' ', $str);
    $str = preg_replace("/[^A-Za-z0-9 _]/", '', $str);
    $str = rawurlencode(utf8_encode($str));
    $str = preg_replace('/-+/', '-', $str);
    $str = str_replace("%20", '-', $str);
    return $str;
}
// The Trongate function (called with $make_lowercase = true) returns: "tiu--tin-tc-1"


I replaced it with:
function convert_url($str) {
    $str = preg_replace("/(à|á|ạ|ả|ã|â|ầ|ấ|ậ|ẩ|ẫ|ă|ằ|ắ|ặ|ẳ|ẵ)/", 'a', $str);
    $str = preg_replace("/(è|é|ẹ|ẻ|ẽ|ê|ề|ế|ệ|ể|ễ)/", 'e', $str);
    $str = preg_replace("/(ì|í|ị|ỉ|ĩ)/", 'i', $str);
    $str = preg_replace("/(ò|ó|ọ|ỏ|õ|ô|ồ|ố|ộ|ổ|ỗ|ơ|ờ|ớ|ợ|ở|ỡ)/", 'o', $str);
    $str = preg_replace("/(ù|ú|ụ|ủ|ũ|ư|ừ|ứ|ự|ử|ữ)/", 'u', $str);
    $str = preg_replace("/(ỳ|ý|ỵ|ỷ|ỹ)/", 'y', $str);
    $str = preg_replace("/(đ)/", 'd', $str);
    $str = preg_replace("/(À|Á|Ạ|Ả|Ã|Â|Ầ|Ấ|Ậ|Ẩ|Ẫ|Ă|Ằ|Ắ|Ặ|Ẳ|Ẵ)/", 'A', $str);
    $str = preg_replace("/(È|É|Ẹ|Ẻ|Ẽ|Ê|Ề|Ế|Ệ|Ể|Ễ)/", 'E', $str);
    $str = preg_replace("/(Ì|Í|Ị|Ỉ|Ĩ)/", 'I', $str);
    $str = preg_replace("/(Ò|Ó|Ọ|Ỏ|Õ|Ô|Ồ|Ố|Ộ|Ổ|Ỗ|Ơ|Ờ|Ớ|Ợ|Ở|Ỡ)/", 'O', $str);
    $str = preg_replace("/(Ù|Ú|Ụ|Ủ|Ũ|Ư|Ừ|Ứ|Ự|Ử|Ữ)/", 'U', $str);
    $str = preg_replace("/(Ỳ|Ý|Ỵ|Ỷ|Ỹ)/", 'Y', $str);
    $str = preg_replace("/(Đ)/", 'D', $str);
    $str = preg_replace("/(\“|\”|\‘|\’|\,|\!|\&|\;|\@|\#|\%|\~|\`|\=|\_|\'|\]|\[|\}|\{|\)|\(|\+|\^)/", '-', $str);
    $str = preg_replace("/( )/", '-', $str);
    return $str;
}
// Returns: "Tieu-de-tin-tuc-1" (wrap the call in strtolower() to get "tieu-de-tin-tuc-1")
// It works nicely.

I want to ask: does this compromise security or break the Trongate framework?
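
For anyone following along, here is a minimal sketch of why the stock function produces "tiu--tin-tc-1". It replays the same steps on the example title, assuming the source file is saved as UTF-8. The variable names are only illustrative, and utf8_encode() is left out because it does nothing to a string that is already pure ASCII.

```php
<?php
$title = "Tiêu đề tin tức 1";

// Same first steps as url_title(): trim, then collapse runs of whitespace.
$collapsed = preg_replace('/\s+/', ' ', trim($title));

// Without the /u flag this class works on bytes, so every byte of each
// accented letter is removed. The whole word "đề" disappears, which leaves
// two spaces next to each other.
$ascii_only = preg_replace("/[^A-Za-z0-9 _]/", '', $collapsed);

// Hyphens are collapsed BEFORE the spaces are turned into hyphens,
// so the doubled space survives as a doubled hyphen.
$encoded = rawurlencode($ascii_only);
$encoded = preg_replace('/-+/', '-', $encoded);
$slug = str_replace('%20', '-', $encoded);

echo strtolower($slug); // tiu--tin-tc-1
```

So the letters are not being mangled by the URL encoding; they are dropped earlier, by the ASCII-only character class.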

quangquoc

User Level: Guest

Date Joined: 15/06/2022

Posted by djnordeen on Wednesday 29th June 2022 at 20:50 GMT

Hello,
Where are you proposing to use this function?

djnordeen

User Level: Early Adopter

Date Joined: 20/08/2021

Posted by quangquoc on Wednesday 29th June 2022 at 22:11 GMT

Hi @djnordeen

function url_title($str, $make_lowercase=false) {
    $str = $make_lowercase == true ? trim(strtolower($str)) : trim($str);
    $str = preg_replace('/\s+/', ' ', $str);
    $str = preg_replace("/[^A-Za-z0-9 _]/", '', $str);
    $str = rawurlencode(utf8_encode($str));
    $str = preg_replace('/-+/', '-', $str);
    $str = str_replace("%20", '-', $str);
    return $str;
}


It's in the file url.php, located at [project_trongate]/engine/tg_helpers/url.php

quangquoc

User Level: Guest

Date Joined: 15/06/2022

Posted by djnordeen on Wednesday 29th June 2022 at 22:33 GMT

If you change the name of the function, you will get an error because url_title will no longer be found.
If you want to change the function itself, keep in mind that it is called in every module you create with the Trongate Desktop app.
Example: line 140 of a module: $data['url_string'] = strtolower(url_title($data['page_headline']));
Also line 310 of the filezone module: $safe_file_name = url_title($file_name);

So I think you will break the code.
Are you generating modules with the code generator in the Desktop app? If so, there is a URL string that makes pretty URLs.

djnordeen

User Level: Early Adopter

Date Joined: 20/08/2021

Posted by quangquoc on Wednesday 29th June 2022 at 23:26 GMT

Haha, I didn't rename the function. I put the new function below the url_title() function inside the framework file.
Yes, I understand now. My problem is a local one, whereas Trongate is a general-purpose framework. I will keep the custom function separate from the framework function.

quangquoc

User Level: Guest

Date Joined: 15/06/2022

Posted by DaFa on Wednesday 29th June 2022 at 23:57 GMT

Xin chào,

I know you have already awarded the solution to Dan and that this is solved for you.

As you might be aware, any changes made in the engine folder will be overwritten if you update the framework via the Desktop app.

I would therefore suggest you create a utility class of your own and add the following:
    function _convert_url($str) {
        $str = preg_replace("/(à|á|ạ|ả|ã|â|ầ|ấ|ậ|ẩ|ẫ|ă|ằ|ắ|ặ|ẳ|ẵ)/", 'a', $str);
        $str = preg_replace("/(è|é|ẹ|ẻ|ẽ|ê|ề|ế|ệ|ể|ễ)/", 'e', $str);
        $str = preg_replace("/(ì|í|ị|ỉ|ĩ)/", 'i', $str);
        $str = preg_replace("/(ò|ó|ọ|ỏ|õ|ô|ồ|ố|ộ|ổ|ỗ|ơ|ờ|ớ|ợ|ở|ỡ)/", 'o', $str);
        $str = preg_replace("/(ù|ú|ụ|ủ|ũ|ư|ừ|ứ|ự|ử|ữ)/", 'u', $str);
        $str = preg_replace("/(ỳ|ý|ỵ|ỷ|ỹ)/", 'y', $str);
        $str = preg_replace("/(đ)/", 'd', $str);
        $str = preg_replace("/(À|Á|Ạ|Ả|Ã|Â|Ầ|Ấ|Ậ|Ẩ|Ẫ|Ă|Ằ|Ắ|Ặ|Ẳ|Ẵ)/", 'A', $str);
        $str = preg_replace("/(È|É|Ẹ|Ẻ|Ẽ|Ê|Ề|Ế|Ệ|Ể|Ễ)/", 'E', $str);
        $str = preg_replace("/(Ì|Í|Ị|Ỉ|Ĩ)/", 'I', $str);
        $str = preg_replace("/(Ò|Ó|Ọ|Ỏ|Õ|Ô|Ồ|Ố|Ộ|Ổ|Ỗ|Ơ|Ờ|Ớ|Ợ|Ở|Ỡ)/", 'O', $str);
        $str = preg_replace("/(Ù|Ú|Ụ|Ủ|Ũ|Ư|Ừ|Ứ|Ự|Ử|Ữ)/", 'U', $str);
        $str = preg_replace("/(Ỳ|Ý|Ỵ|Ỷ|Ỹ)/", 'Y', $str);
        $str = preg_replace("/(Đ)/", 'D', $str);
        $str = preg_replace("/(\“|\”|\‘|\’|\,|\!|\&|\;|\@|\#|\%|\~|\`|\=|\_|\'|\]|\[|\}|\{|\)|\(|\+|\^)/", '-', $str);
        $str = preg_replace("/( )/", '-', $str);
        return $str;
    }

    function url_title_viet($string, $make_lowercase = false) {
        // Transliterate first: strtolower() does not lowercase multibyte letters
        // such as "Đ", so lowercasing must happen after they have become ASCII.
        $string = $this->_convert_url(trim($string));
        $string = $make_lowercase == true ? strtolower($string) : $string;
        $string = preg_replace('/\s+/', ' ', $string);
        // $string = preg_replace("/[^A-Za-z0-9 _]/", '', $string);
        $string = rawurlencode(utf8_encode($string));
        $string = preg_replace('/-+/', '-', $string);
        $string = str_replace("%20", '-', $string);
        return $string;
    }


and, because both functions are methods that call each other through $this, invoke it from a method of that same class with:
        $string = "Tiêu đề tin tức";
        echo $this->url_title_viet($string, true);

output:
tieu-de-tin-tuc
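
As a possible variation, the fourteen letter-by-letter preg_replace() calls can be folded into a single lookup table applied with strtr(). This is only a sketch: viet_to_ascii() is a hypothetical name, not a Trongate function, and it assumes the input uses precomposed (NFC) characters like the ones in the regexes above. It covers the letters only; punctuation and spaces are left untouched.

```php
<?php
// Hypothetical helper (not part of Trongate): one lookup table, one strtr() pass.
function viet_to_ascii(string $str): string {
    // Each ASCII letter, followed by the accented forms it replaces.
    $groups = [
        'a' => 'àáạảãâầấậẩẫăằắặẳẵ',
        'e' => 'èéẹẻẽêềếệểễ',
        'i' => 'ìíịỉĩ',
        'o' => 'òóọỏõôồốộổỗơờớợởỡ',
        'u' => 'ùúụủũưừứựửữ',
        'y' => 'ỳýỵỷỹ',
        'd' => 'đ',
        'A' => 'ÀÁẠẢÃÂẦẤẬẨẪĂẰẮẶẲẴ',
        'E' => 'ÈÉẸẺẼÊỀẾỆỂỄ',
        'I' => 'ÌÍỊỈĨ',
        'O' => 'ÒÓỌỎÕÔỒỐỘỔỖƠỜỚỢỞỠ',
        'U' => 'ÙÚỤỦŨƯỪỨỰỬỮ',
        'Y' => 'ỲÝỴỶỸ',
        'D' => 'Đ',
    ];

    $map = [];
    foreach ($groups as $ascii => $accented) {
        // The /u flag splits on UTF-8 character boundaries rather than bytes.
        foreach (preg_split('//u', $accented, -1, PREG_SPLIT_NO_EMPTY) as $char) {
            $map[$char] = $ascii;
        }
    }

    // With an array argument, strtr() replaces every key in a single pass.
    return strtr($str, $map);
}

echo viet_to_ascii("Tiêu đề tin tức"); // Tieu de tin tuc
```

The table is easier to extend than a chain of regexes, but it still only knows about the characters listed in it.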

This comment was edited by DaFa on Thursday 30th June 2022 at 07:15 GMT


DaFa

User Level: Founding Member

Date Joined: 30/11/2018

Posted by quangquoc on Thursday 30th June 2022 at 03:13 GMT

Thank you @DaFa
Everything is clear now.
This complete answer should be the accepted answer.

quangquoc

User Level: Guest

Date Joined: 15/06/2022

Posted by Mujaffar on Monday 11th July 2022 at 11:19 GMT

In my case, this is how I solved it, and it works with all non-Latin characters.

I just replaced the url_title function with this one:

function url_title($value, $transliteration = true) {
    if (extension_loaded('intl') && $transliteration == true) {
        $transliterator = \Transliterator::create('Any-Latin; Latin-ASCII');
        $value = $transliterator->transliterate($value);
    } 
    $slug = html_entity_decode($value, ENT_QUOTES, 'UTF-8');
    // replace non letter or digits by -
    $slug = preg_replace('~[^\pL\d]+~u', '-', $slug);
    // trim
    $slug = trim($slug, '-');
    $slug = strtolower($slug);
    return $slug;
}

Mujaffar

User Level: Guest

Date Joined: 11/08/2022

Posted by DaFa on Tuesday 12th July 2022 at 03:32 GMT

Hi framex,

Very nice - I like it 👍

I see that when the 'intl' PHP extension is not loaded, the string returned for quangquoc's example is:
tiêu-đề-tin-tức

otherwise, if it is loaded in php.ini with
extension=intl

the returned string is:
tieu-de-tin-tuc
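
For anyone who wants to check which of the two cases applies on their own server, here is a small sketch. to_ascii_or_null() is a hypothetical name used only for illustration. It relies on transliterator_transliterate(), the procedural form of the Transliterator call in the function above, and returns null instead of silently skipping the step when intl is missing.

```php
<?php
// Hypothetical helper: returns the transliterated string, or null when the
// intl extension is not loaded or the transliteration fails.
function to_ascii_or_null(string $value): ?string {
    if (!extension_loaded('intl')) {
        return null;
    }
    $result = transliterator_transliterate('Any-Latin; Latin-ASCII', $value);
    return $result === false ? null : $result;
}

$ascii = to_ascii_or_null("Tiêu đề tin tức");
if ($ascii === null) {
    echo "intl is not loaded, so no transliteration was applied\n";
} else {
    echo $ascii . "\n";
}
```

Making the missing extension visible like this can help avoid slugs being stored in two different styles depending on which machine generated them.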

which is exactly what quangquoc was after. Kudos to you! In my opinion, your code is worthy of a pull request to replace the url_title() function in engine/tg_helpers/url.php in the framework, and quangquoc should shift the solution points to you 😊

DaFa

User Level: Founding Member

Date Joined: 30/11/2018

Posted by quangquoc on Wednesday 13th July 2022 at 06:46 GMT

Hi framex,

Your code is great.
It saves ~20 lines.

And up to this point, as everyone can see, I believe this is the most concise and beautiful solution for my problem.

I will try it with more text and report the results here.

Thanks to @djnordeen, @DaFa, @framex again.

quangquoc

User Level: Guest

Date Joined: 15/06/2022
